In a previous post, we talked about how you can use Loominus Teraport’s GUI Designer to build a powerful data pipeline to transform raw data and engineer new features for machine learning. In this post, we’ll discuss how you can use the Teraport event ingestion API to collect and store data for analytics in near real time. The Teraport event ingestion API is tuned to handle streaming data and is well suited to use cases involving growing volumes and varieties of data from numerous disparate sources such as social media, web, mobile apps and IoT sensors.
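To make the idea concrete, here is a minimal sketch of what event ingestion could look like from client code. The endpoint URL, field names, and auth header below are hypothetical illustrations, not the actual Teraport API surface; consult the API documentation for the real schema.

```python
import json
import urllib.request

# Hypothetical endpoint and API key -- placeholders, not the real Teraport API.
TERAPORT_ENDPOINT = "https://example.com/v1/events"
API_KEY = "YOUR_API_KEY"

def build_event(source, event_type, payload):
    """Assemble a single event record for ingestion."""
    return {
        "source": source,    # e.g. "mobile", "web", "sensor"
        "type": event_type,  # e.g. "click", "reading"
        "payload": payload,  # arbitrary event attributes
    }

def send_event(event):
    """POST one JSON-encoded event to the (hypothetical) ingestion endpoint."""
    body = json.dumps(event).encode("utf-8")
    req = urllib.request.Request(
        TERAPORT_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)  # returns the HTTP response

event = build_event("sensor", "reading", {"temperature_c": 21.5})
```

In a real streaming setup you would batch events and retry on transient failures; this sketch only shows the shape of a single request.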
We are excited to announce the addition of TensorFlow to the Loominus machine learning Provider Plugin architecture! Loominus has taken the complexity out of hand-coding TensorFlow models, making it easier for anyone to experiment with neural networks on classification and regression problems. In this post we’ll discuss how to use TensorFlow for classification and regression models in Learner.
The response to our launch of Loominus Public has been incredible! We are busy keeping up with all the invites and registering users as fast as possible. While our public beta program continues, we love that our users help shape the future of Teraport and Learner. This post describes some of the areas where our users contribute.
Loominus has three pricing plans (more details to be published soon):
- Public – Free to use, with shared data and models
- Teams – Data and models private to your team
- Enterprise – Dedicated infrastructure and support
We listen to our users and encourage everyone to use the platform and interact with us as you find bugs or have feature ideas.
Hung demonstrates how to create a data staging and reporting table. (Have a look at how to do feature engineering in Teraport.) He then trains and compares binary classification models from Scikit-Learn, XGBoost and LightGBM using Learner’s unified Provider Plugin architecture.
Finally, Hung selects and deploys the best model to an API endpoint using Modops. He goes on to demonstrate how you can use the API in your own applications to get predictions from the model.
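Calling a deployed model from an application amounts to sending a row of features and reading back a prediction. The sketch below is illustrative only: the endpoint URL, model name, and payload schema are hypothetical, not the actual Modops API.

```python
import json
import urllib.request

# Hypothetical endpoint and payload schema -- the real Modops API may differ.
PREDICT_URL = "https://example.com/modops/v1/models/loan-default/predict"

def build_prediction_request(features):
    """Build a POST request carrying one row of features as JSON."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        PREDICT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(features):
    """Send the request and return the parsed prediction response."""
    with urllib.request.urlopen(build_prediction_request(features)) as resp:
        return json.load(resp)

req = build_prediction_request({"credit_score": 705, "revolving_balance": 12000})
```

Any HTTP client works here; the only contract that matters is the JSON schema the endpoint expects.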
One of the most common mistakes data scientists make when training machine learning models is incorrectly splitting data for training and testing. The train/test split divides the available data so that model fitting and model evaluation use separate examples. Usually the data is divided into two parts:
- Training data set – The data used to train the model
- Testing data set – The hold-out data used to test the performance of the model
Typically we reserve 70% of the data for training and 30% for testing. This split can vary and should be adjusted depending on the volume of data, the kinds of models under consideration and the purpose of the modeling exercise.
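The 70/30 split above can be sketched in plain Python. This is a minimal illustration with a fixed random seed; libraries such as scikit-learn provide a `train_test_split` helper that also handles stratification.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle rows and split them into training and hold-out testing sets."""
    rows = list(rows)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]  # (train, test)

data = list(range(10))
train, test = train_test_split(data)
# 70/30 split: 7 rows for training, 3 held out for testing
```

Shuffling before splitting matters: if the data is ordered (say, by date), a naive head/tail split leaks that ordering into the evaluation.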
Data pipelines are where practitioners spend most of their time, because the bulk of a machine learning project is data collection and cleaning. Loominus gives everyone the power to build the data pipelines critical to any machine learning project.
Teraport is a powerful tool within the Loominus product suite that ingests and stages data. In another post, we’ll discuss the data ingestion APIs. For now we’ll focus on building a powerful data pipeline for feature engineering.
We’re going to build a data pipeline that generates the average credit score of borrowers within a portfolio of loans. For added complexity, we’ll weight the credit score by each borrower’s outstanding revolving credit balance. Finally, we’ll group loans by whether they are on time or delinquent and aggregate the weighted credit scores. The result will be a weighted average credit score for on-time loans and for delinquent loans.
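The aggregation the pipeline computes can be sketched in plain Python. The records and column names below are toy illustrations; in Teraport these fields would come from the staged loan data.

```python
from collections import defaultdict

# Toy loan records -- hypothetical column names for illustration.
loans = [
    {"status": "on_time",    "credit_score": 720, "revolving_balance": 5000},
    {"status": "on_time",    "credit_score": 680, "revolving_balance": 15000},
    {"status": "delinquent", "credit_score": 640, "revolving_balance": 8000},
    {"status": "delinquent", "credit_score": 600, "revolving_balance": 2000},
]

def weighted_avg_score_by_status(loans):
    """Average credit score per loan status, weighted by revolving balance."""
    sums = defaultdict(lambda: [0.0, 0.0])  # status -> [weighted sum, total weight]
    for loan in loans:
        w = loan["revolving_balance"]
        sums[loan["status"]][0] += loan["credit_score"] * w
        sums[loan["status"]][1] += w
    return {status: s / w for status, (s, w) in sums.items()}

result = weighted_avg_score_by_status(loans)
# Borrowers with larger revolving balances pull the group average toward
# their score -- e.g. the 680-score borrower dominates the on_time group.
```

In the pipeline itself this would be a group-by/aggregate step; the Python version just makes the weighted-average arithmetic explicit.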