Posts

Showing posts from April, 2022

10 predictions for Customer Data Platforms (CDPs) in 2030

How up-and-coming trends will transform CDPs by 2030

Over the course of my career in software and data, I’ve learned the hard way that designing and managing a great customer experience is hard. No two customer experiences are the same, and each has its own background story. Whenever we try to simplify experiences and model them linearly, we fall short. Customers rarely follow “one path”; every customer arrives at your product or service with a different context, a different struggle, and through a different medium. Recent developments in Customer Data Platforms (CDPs), from the likes of Segment and Rudderstack, are starting to address these issues. But I think there’s still a lot more to come as we move closer to the end goal of offering a great, delightful customer experience. In this post, I’ll paint a picture of what I hope and expect to see from CDPs by 2030. I’ll explain what a CDP is and provide extra context on the data landscape, befor...

Machine Learning With Spark

This is a comprehensive tutorial on using the Spark distributed machine learning framework to build a scalable ML data pipeline. I will cover the basic machine learning algorithms implemented in the Spark MLlib library, and throughout this tutorial I will use PySpark in a Python environment. Machine learning is increasingly popular for solving real-world problems in almost every business domain. It helps solve problems using data that is often unstructured, noisy, and very large. As data sizes and the variety of data sources grow, solving machine learning problems with standard techniques poses a big challenge. Spark is a distributed processing engine that generalizes the MapReduce model to solve big data processing problems. The Spark framework has its own machine learning module called MLlib. In this article, I will use PySpark and Spark MLlib to demonstrate machine learning with distributed processing. Readers will ...