A few of the results achieved:
About the organization:
GetYourGuide operates a leading online platform and marketplace where people discover and book sightseeing tours, tickets for attractions, and other experiences around the world. With Knoldus' expertise, they can now ingest massive volumes of data for the downstream machine learning that powers their personalized online marketplace.
Challenge: Slow data pipelines struggle with big data
GetYourGuide’s primary mission is to leverage data and AI to power a personalized shopping experience for their customers. However, data engineering struggles limited their ability to efficiently ingest massive volumes of data each day for downstream machine learning. They process over 600 GB of data per day and offer customers over 50,000 activities to choose from. Slow ETL pipelines blocked data science progress: their primary ETL pipeline ran for more than five hours, which held back their ability to innovate on the customer experience. Single-node processing could not scale to support the explosion of data.
Solution: Unlocking data science innovation with scalable data pipelines
Knoldus provided GetYourGuide with a unified data analytics platform that fosters a scalable, collaborative environment across data science and engineering. Data engineers and data scientists can combine exploratory workloads with production pipelines in the same environment without adding infrastructure complexity. A fully managed platform on AWS lets them use native tools such as S3 as their file system. Automated cluster management simplifies infrastructure and operations at any scale. A collaborative notebook environment with support for multiple languages (SQL, Scala, Python, R) enables a diverse team of users to work together, each in their preferred language.
Results: Improved operational efficiencies boost business impact
With Knoldus' solution, GetYourGuide was able to speed up their data processing while handling larger workloads. This newfound flexibility opened the door to machine learning innovations that power their recommendation and relevance engine.