writing
Notes from the pipeline
Working notes on production ML, MLOps, and the gap between a model that scores well and one a team can run.
Turning a five-script POC into a pipeline that survives Monday
A proof of concept runs once, on your machine, while you watch. Production runs unattended, forever. Here's the refactor that bridged the two for an RCM classifier.
#mlops #pyspark #data-engineering
Your classifier's default threshold is a bug
An AUC of 0.90 told me the model was good. A 0.5 cutoff told the business it was useless. Here's how I closed that gap on a production RCM classifier.
#ml #production #xgboost