I burned some rubber (and some cash!) on batch prediction, but there's still some gas in the tank and tires ain't totally dead yet, so let's do a few more laps and see how the real-time thing works :)
Assuming you've already built and evaluated a model, one single operation is required to perform real-time prediction: create a real-time endpoint, i.e. a web service URL to send your requests to.
Of course, we could do this in the AWS console, but why not use the CLI instead? A word of warning: at the time of writing, the CLI package available from the AWS website suffers from a nasty bug on prediction calls (see https://forums.aws.amazon.com/thread.jspa?messageID=615018). You'll need to download and install the latest version from GitHub:
% wget https://github.com/aws/aws-cli/archive/develop.zip
% unzip develop.zip
% cd aws-cli-develop
% sudo python setup.py install
% aws configure
✗ aws machinelearning describe-ml-models
{
    "Results": [
        {
            "Status": "COMPLETED",
            "SizeInBytes": 5821257,
            "Name": "ML model: Data3.txt",
            "TrainingParameters": {
                "sgd.l2RegularizationAmount": "1e-6",
                "sgd.maxMLModelSizeInBytes": "104857600",
                "sgd.maxPasses": "10",
                "algorithm": "sgd",
                "sgd.l1RegularizationAmount": "0.0"
            },
            "CreatedByIamUser": "MY_ARN",
            "EndpointInfo": {
                "PeakRequestsPerSecond": 0,
                "EndpointStatus": "NONE"
            },
            "MLModelId": "ml-noV4tkTZgnO",
            "InputDataLocationS3": "s3://jsimon-logs-us/data3.txt",
            "LastUpdatedAt": 1429171168.23,
            "TrainingDataSourceId": "5d07396a-3a43-4bc2-87f2-bd8308f22f85",
            "CreatedAt": 1429170603.421
        }
    ]
}
✗ aws machinelearning describe-evaluations
{
    "Results": [
        {
            "EvaluationDataSourceId": "9f8bc29a-54d6-49a5-897c-089fbe7abf01",
            "Status": "COMPLETED",
            "Name": "Evaluation: ML model: Data3.txt",
            "InputDataLocationS3": "s3://jsimon-logs-us/data3.txt",
            "EvaluationId": "ev-ZYV6BJeN4rQ",
            "CreatedByIamUser": "MY_ARN",
            "MLModelId": "ml-noV4tkTZgnO",
            "LastUpdatedAt": 1429171423.973,
            "PerformanceMetrics": {
                "Properties": {
                    "RegressionRMSE": "357.07353237287543"
                }
            },
            "CreatedAt": 1429170603.766
        }
    ]
}
✗ aws machinelearning get-evaluation --evaluation-id ev-ZYV6BJeN4rQ
{
    "EvaluationDataSourceId": "9f8bc29a-54d6-49a5-897c-089fbe7abf01",
    "Status": "COMPLETED",
    "Name": "Evaluation: ML model: Data3.txt",
    "InputDataLocationS3": "s3://jsimon-logs-us/data3.txt",
    "EvaluationId": "ev-ZYV6BJeN4rQ",
    "CreatedByIamUser": "MY_ARN",
    "MLModelId": "ml-noV4tkTZgnO",
    "LastUpdatedAt": 1429171423.973,
    "LogUri": "https://eml-prod-emr.s3.amazonaws.com/631572398268-ev-ev-ZYV6BJeN4rQ/userlog/631572398268-ev-ev-ZYV6BJeN4rQ?AWSAccessKeyId=MY_ACCESS_KEY&Expires=1429776404&Signature=GTiEfLpGOBs8I8oZPG%2FVBpf3BYM%3D",
    "PerformanceMetrics": {
        "Properties": {
            "RegressionRMSE": "357.07353237287543"
        }
    },
    "CreatedAt": 1429170603.766
}
✗ aws machinelearning create-realtime-endpoint --ml-model-id ml-noV4tkTZgnO
{
    "MLModelId": "ml-noV4tkTZgnO",
    "RealtimeEndpointInfo": {
        "EndpointStatus": "UPDATING",
        "PeakRequestsPerSecond": 200,
        "CreatedAt": 1429171982.734,
        "EndpointUrl": "https://realtime.machinelearning.us-east-1.amazonaws.com"
    }
}
✗ aws machinelearning describe-ml-models
{
    "Results": [
        {
            "Status": "COMPLETED",
            "SizeInBytes": 5821257,
            "Name": "ML model: Data3.txt",
            "TrainingParameters": {
                "sgd.l2RegularizationAmount": "1e-6",
                "sgd.maxMLModelSizeInBytes": "104857600",
                "sgd.maxPasses": "10",
                "algorithm": "sgd",
                "sgd.l1RegularizationAmount": "0.0"
            },
            "CreatedByIamUser": "MY_ARN",
            "EndpointInfo": {
                "EndpointStatus": "READY",
                "PeakRequestsPerSecond": 200,
                "CreatedAt": 1429171982.691,
                "EndpointUrl": "https://realtime.machinelearning.us-east-1.amazonaws.com"
            },
            "MLModelId": "ml-noV4tkTZgnO",
            "InputDataLocationS3": "s3://jsimon-logs-us/data3.txt",
            "LastUpdatedAt": 1429171168.23,
            "TrainingDataSourceId": "5d07396a-3a43-4bc2-87f2-bd8308f22f85",
            "CreatedAt": 1429170603.421
        }
    ]
}
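The endpoint reports UPDATING right after creation and only shows READY a little later, so a script should check the status before sending predictions. Here is a minimal Python sketch that reads it from the JSON printed by describe-ml-models. It assumes the output shape shown above, and `endpoint_info` is a hypothetical helper of mine, not part of the AWS CLI.

```python
import json


def endpoint_info(describe_output, model_id):
    """Return (status, url) for model_id from `describe-ml-models` JSON output.

    Hypothetical helper: it only relies on the fields visible in the
    output above (Results, MLModelId, EndpointInfo).
    """
    for model in json.loads(describe_output).get("Results", []):
        if model.get("MLModelId") == model_id:
            info = model.get("EndpointInfo", {})
            # EndpointUrl is absent until an endpoint exists, hence the None default.
            return info.get("EndpointStatus", "NONE"), info.get("EndpointUrl")
    raise KeyError("no such model: %s" % model_id)


# Trimmed-down copy of the output above, used as sample input.
sample = """
{"Results": [{"MLModelId": "ml-noV4tkTZgnO",
              "EndpointInfo": {"EndpointStatus": "READY",
                               "PeakRequestsPerSecond": 200,
                               "EndpointUrl": "https://realtime.machinelearning.us-east-1.amazonaws.com"}}]}
"""
status, url = endpoint_info(sample, "ml-noV4tkTZgnO")
print(status, url)
```

In a real script, you would feed it the captured output of the CLI call and retry until the status is READY.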
✗ aws machinelearning predict \
    --ml-model-id ml-noV4tkTZgnO \
    --record \
    '{"lastname":"Simon", "firstname":"Julien",
      "gender":"M", "state":"Texas","age":"44",
      "month":"4", "day":"106", "hour":"10", "minutes":"26",
      "items":"5" }' \
    --predict-endpoint https://realtime.machinelearning.us-east-1.amazonaws.com
{
    "Prediction": {
        "predictedValue": 752.190185546875,
        "details": {
            "PredictiveModelType": "REGRESSION",
            "Algorithm": "SGD"
        }
    }
}
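Note that every value in the record is passed as a string, numeric ones like age included: the Predict API takes the record as a map of strings. Here is a small Python sketch that builds the --record argument from a plain dictionary. It assumes the field names of my data set, and `to_record` is a hypothetical helper, not something the CLI provides.

```python
import json


def to_record(fields):
    """Serialize a dict as a --record argument, converting every value to a string."""
    return json.dumps({str(k): str(v) for k, v in fields.items()})


record = to_record({"lastname": "Simon", "firstname": "Julien", "gender": "M",
                    "state": "Texas", "age": 44, "month": 4, "day": 106,
                    "hour": 10, "minutes": 26, "items": 5})

# Argument list suitable for subprocess.run(); nothing is executed here.
command = ["aws", "machinelearning", "predict",
           "--ml-model-id", "ml-noV4tkTZgnO",
           "--record", record,
           "--predict-endpoint",
           "https://realtime.machinelearning.us-east-1.amazonaws.com"]
print(record)
```

Passing the arguments as a list avoids the shell quoting gymnastics around the JSON string.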
One last question for today: how fast is this baby? I tested two different setups:
- Round trip between my office in Paris and the endpoint in us-east-1: about 700 ms
- Round trip between an EC2 instance in us-east-1 and the endpoint in us-east-1: about 300 ms
# Paris -> us-east-1
% time aws machinelearning predict --ml-model-id ml-noV4tkTZgnO --record '{"lastname":"Simon", "firstname":"Julien", "gender":"M", "state":"Texas","age":"44", "month":"4", "day":"106", "hour":"10", "minutes":"26", "items":"5" }' --predict-endpoint https://realtime.machinelearning.us-east-1.amazonaws.com
{
    "Prediction": {
        "predictedValue": 752.190185546875,
        "details": {
            "PredictiveModelType": "REGRESSION",
            "Algorithm": "SGD"
        }
    }
}
aws machinelearning predict --ml-model-id ml-noV4tkTZgnO --record  0,22s user 0,06s system 40% cpu 0,694 total
# us-east-1 -> us-east-1
[ec2-user@ip-172-30-1-166 aws-cli-develop]$ time aws machinelearning predict --ml-model-id ml-noV4tkTZgnO --record '{"lastname":"Simon", "firstname":"Julien", "gender":"M", "state":"Texas","age":"44", "month":"4", "day":"106", "hour":"10", "minutes":"26", "items":"5" }' --predict-endpoint https://realtime.machinelearning.us-east-1.amazonaws.com
{
    "Prediction": {
        "predictedValue": 752.190185546875,
        "details": {
            "PredictiveModelType": "REGRESSION",
            "Algorithm": "SGD"
        }
    }
}
real 0m0.312s
user 0m0.240s
sys 0m0.036s
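One caveat on these numbers: `time` measures the whole CLI process, Python start-up included, not just the network call. A rough way to separate the two is to subtract CPU time (user + sys) from wall-clock time; what remains is approximately the time spent waiting on the network and the service. The Python sketch below applies this to the figures above. It assumes a single-threaded process, and the function name is mine.

```python
def waiting_time(real, user, system):
    """Wall-clock seconds not accounted for by CPU time (user + sys).

    Rough approximation: assumes a single-threaded process, so the
    remainder is mostly time spent waiting on I/O such as the network.
    """
    return real - (user + system)


# Figures copied from the two `time` outputs above, in seconds.
paris = waiting_time(real=0.694, user=0.22, system=0.06)
ec2 = waiting_time(real=0.312, user=0.240, system=0.036)

print("Paris -> us-east-1: %.0f ms waiting" % (paris * 1000))
print("EC2   -> us-east-1: %.0f ms waiting" % (ec2 * 1000))
```

By this estimate, only about 36 ms of the 312 ms measured from EC2 is spent waiting, so a good part of the total may be client-side overhead rather than the service itself. Timing the call from a long-running process would give a cleaner number.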
Slower than expected. The Amazon ML FAQ says:
Q: How fast can the Amazon Machine Learning real-time API generate predictions?
Most real-time prediction requests return a response within 100 MS, making them fast enough for interactive web, mobile, or desktop applications. The exact time it takes for the real-time API to generate a prediction varies depending on the size of the input data record, and the complexity of the data processing “recipe” associated with the ML model that is generating the predictions
300 ms feels slow, especially since my model doesn't strike me as super-complicated. Maybe I'm jaded ;)
Ok, enough bitching :) This product has only been out a week or so and it's already fuckin' AWESOME. It's hard to believe Amazon made ML this simple. If I can get this to work, anyone can.
Given the right level of price, performance and scale (all will come quickly), I see this product crushing the competition... and not only other ML SaaS providers. Hardware & software vendors should start sweating even more than they already do.
C'mon, give this thing a try and tell me you're STILL eager to build Hadoop clusters and write Map-Reduce jobs. Seriously?
Till next time, keep rockin'.