Data pipelines are where people who work with data spend most of their time, because the bulk of a machine learning project is data collection and cleaning. Loominus gives everyone the power to build the data pipelines critical to any machine learning project.

Teraport is a tool within the Loominus product suite that ingests and stages data. We’ll discuss the data ingestion APIs in another post. For now, we’ll focus on building a data pipeline for feature engineering.

We’re going to build a data pipeline that computes the average credit score of borrowers within a portfolio of loans. For added complexity, we’ll weight each credit score by the borrower’s outstanding revolving credit balance. Finally, we’ll group the loans by status, either on time or delinquent, and aggregate the weighted credit scores within each group. The result will be one weighted average credit score for on-time loans and one for delinquent loans.
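To make the target calculation concrete, here is a minimal sketch in plain Python of the arithmetic the pipeline is meant to produce. It is not Teraport code: the record layout, the field names (`status`, `credit_score`, `revolving_balance`), and the sample values are all assumptions made up for illustration. For each group, the weighted average is the sum of score times balance divided by the sum of balances.

```python
from collections import defaultdict

# Hypothetical loan records; field names and values are illustrative only.
loans = [
    {"status": "on_time", "credit_score": 720, "revolving_balance": 5000.0},
    {"status": "on_time", "credit_score": 680, "revolving_balance": 15000.0},
    {"status": "delinquent", "credit_score": 600, "revolving_balance": 8000.0},
    {"status": "delinquent", "credit_score": 640, "revolving_balance": 2000.0},
]


def weighted_avg_credit_score(records):
    """Return {status: balance-weighted average credit score}."""
    weighted_sum = defaultdict(float)  # sum of score * balance per status
    total_weight = defaultdict(float)  # sum of balance per status
    for record in records:
        status = record["status"]
        balance = record["revolving_balance"]
        weighted_sum[status] += record["credit_score"] * balance
        total_weight[status] += balance
    # Skip groups whose total balance is zero to avoid dividing by zero.
    return {
        status: weighted_sum[status] / total_weight[status]
        for status in weighted_sum
        if total_weight[status] > 0
    }


result = weighted_avg_credit_score(loans)
print(result)
```

With these sample values, the on-time group works out to (720 × 5000 + 680 × 15000) / 20000 = 690 and the delinquent group to (600 × 8000 + 640 × 2000) / 10000 = 608. Borrowers with larger revolving balances pull their group's average toward their own score, which is the point of weighting.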
