Shopify's machine learning platform team builds the infrastructure, tools, and abstraction layers that help data scientists streamline, accelerate, and simplify their machine learning workflows.

There are many different kinds of machine learning use cases at Shopify, both internal and external. Internal use cases are developed and used in specialized domains like fraud detection and revenue prediction. External use cases are merchant and buyer facing, and include projects such as product categorization and recommendation systems.

At Shopify we build for the long term, and last year we decided to redesign our machine learning platform. We need a machine learning platform that can handle different (often conflicting) requirements, inputs, data types, dependencies, and integrations. The platform should be flexible enough to support the different aspects of building machine learning solutions in production, and enable our data scientists to use the best tools for the job.

In this post, we walk through how we built Merlin, our magical new machine learning platform. We dive into the architecture, working with the platform, and a product use case.

Our new machine learning platform is based on an open source stack and technologies. Using open source tooling end-to-end was important to us because we wanted to both draw from and contribute to the most up-to-date technologies and their communities, as well as keep the agility to evolve the platform to our users' needs.

Merlin's objective is to enable Shopify's teams to train, test, deploy, serve, and monitor machine learning models efficiently and quickly. We designed the platform around three goals:

- Scalability: robust infrastructure that can scale up our machine learning workflows.
- Fast iterations: tools that reduce friction and increase productivity for our data scientists and machine learning engineers by minimizing the gap between prototyping and production.
- Flexibility: users can use any libraries or packages they need for their models.

For the first iteration of Merlin, we focused on enabling training and batch inference on the platform.

Merlin Architecture

A high level diagram of Merlin's architecture

Merlin gives our users the tools to run their machine learning workflows. Typically, large scale data modeling and processing at Shopify happens in other parts of our data platform, using tools such as Spark. The data and features are then saved to our data lake or to Pano, our feature store. Merlin uses these features and datasets as inputs to the machine learning tasks it runs, such as preprocessing, training, and batch inference.

With Merlin, each use case runs in a dedicated environment that can be defined by its tasks, dependencies, and required resources - we call these environments Merlin Workspaces. These dedicated environments also enable distributed computing and scalability for the machine learning tasks that run on them. Behind the scenes, Merlin Workspaces are actually Ray clusters that we deploy on our Kubernetes cluster. They are designed to be short lived, since batch jobs only process for a certain amount of time.

We built the Merlin API as a consolidated service that allows the creation of Merlin Workspaces on demand. Our users can then use their Merlin Workspace from Jupyter Notebooks to prototype their work, or orchestrate it through Airflow or Oozie.

Merlin's architecture, and Merlin Workspaces in particular, are enabled by one of our core components: Ray.

Ray is an open source framework that provides a simple, universal API for building distributed systems and tools to parallelize machine learning workflows. Ray has a large ecosystem of applications, libraries, and tools dedicated to machine learning, such as distributed scikit-learn, XGBoost, TensorFlow, PyTorch, etc. When using Ray, you get a cluster that enables you to distribute your computation across multiple CPUs and machines.

In the following example, we train a model using Ray: