At AWS re:Invent, Amazon Web Services expanded its toolkit of machine learning application services with the announcements of Amazon Comprehend Medical, Amazon Forecast, Amazon Personalize, and Amazon Textract. These new services augment the capabilities Amazon provides to end users for text analysis, personalized recommendations, and time series forecasts. The continued growth of these individual services removes obstacles for companies looking to get started with common machine learning tasks on a smaller scale. Rather than building a wholesale data science pipeline in-house, companies can use these services to get a single task done quickly, which permits an incremental introduction to machine learning. Forecast, Personalize, and Textract are in preview, while Comprehend Medical is available now.
Amazon Comprehend Medical, Forecast, Personalize, and Textract join a collection of machine learning services that includes speech recognition (Transcribe), translation (Translate), conversational interfaces and text-to-speech (Lex and Polly) to power machine conversation such as chatbots and Alexa, general text analytics (Comprehend), and image and video analysis (Rekognition).
Amazon Personalize lets developers add personalized recommendations to their apps, based on an activity stream from the app and a corpus of what’s available to be recommended, whether that’s products, articles, or other items. In addition to recommendations, Personalize can also be used to customize search results and notifications. By combining a given search string or location with contextual behavior data, Amazon aims to help its customers return results relevant enough to build trust with their own users.
Amazon Forecast builds private, custom time-series forecast models that predict future trends based on a customer’s own data. Customers provide both historical data and related causal data, and Forecast analyzes the data to determine the relevant factors in building its models and providing forecasts.
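To make the idea of pairing historical data with related causal data concrete, here is a minimal sketch in plain Python. It is not how Amazon Forecast works internally, which the announcement does not describe; it swaps in a one-variable ordinary least squares fit as a stand-in. The series names (`promo_spend`, `units_sold`) and all figures are made up for illustration.

```python
from statistics import mean

def fit_line(related, target):
    """Ordinary least squares fit of target = intercept + slope * related."""
    if len(related) != len(target) or len(related) < 2:
        raise ValueError("need two or more paired observations")
    mean_x, mean_y = mean(related), mean(target)
    sxx = sum((x - mean_x) ** 2 for x in related)
    if sxx == 0:
        raise ValueError("related series must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(related, target))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def forecast(intercept, slope, future_related):
    """Predict the target for planned future values of the related series."""
    return [intercept + slope * x for x in future_related]

# Hypothetical history: promotional spend (related data) and units sold (target).
promo_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [12.0, 14.0, 16.0, 18.0]

intercept, slope = fit_line(promo_spend, units_sold)
predicted = forecast(intercept, slope, [5.0, 6.0])  # planned spend, next two periods
print(predicted)
```

A managed service would presumably weigh many related series and handle seasonality; the sketch only shows why the related data matters, since the forecast moves with the planned values of the related series.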
Amazon Textract extracts text and data from scanned documents, without requiring manual data entry or custom code. In particular, Textract uses machine learning to recognize when data sits in a table or form field and to treat it accordingly, which should save a significant amount of time compared with standard OCR.
Finally, Amazon Comprehend Medical, an extension of last year’s Amazon Comprehend, uses natural language processing to analyze unstructured medical text such as doctor’s notes or clinical trial records, and extract relevant information from this text.
Organizations doing resource planning, financial planning, or other similar forecasting that currently lack the capability to do time series forecasting in-house should consider using Amazon Forecast to predict product demand, staffing levels, inventory levels, and material availability, and to perform financial forecasting. Outsourcing the construction of complex forecasting models lets departments focus on the predictions themselves.
Consumer-oriented organizations that want to build higher levels of customer engagement, but that currently provide only generic, uncontextualized recommendations (based on popularity or other simple measures), should consider using Amazon Personalize to provide personalized recommendations, search results, and notifications via their apps and websites. Providing high-quality, relevant recommendations in real time builds customer trust in the quality of an organization’s engagement efforts, particularly compared to the average spray-and-pray marketing communication.
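The gap between a popularity-based list and a personalized one can be shown with a toy sketch. This is not Amazon Personalize's algorithm, which the announcement does not detail; it swaps in a simple co-occurrence count, and the users, items, and events are all invented for illustration.

```python
from collections import Counter, defaultdict

def popularity_ranking(events, catalog):
    """Rank catalog items by total interactions; every user sees the same list."""
    counts = Counter(item for _, item in events if item in catalog)
    return sorted(catalog, key=lambda item: (-counts[item], item))

def personalized_ranking(events, catalog, user):
    """Rank items the user has not seen by overlap with similar users' histories."""
    items_by_user = defaultdict(set)
    for event_user, item in events:
        if item in catalog:
            items_by_user[event_user].add(item)
    seen = items_by_user.get(user, set())
    scores = Counter()
    for other, items in items_by_user.items():
        if other == user:
            continue
        overlap = len(seen & items)  # shared items act as a crude similarity weight
        if overlap == 0:
            continue
        for item in items - seen:
            scores[item] += overlap
    candidates = [item for item in catalog if item not in seen]
    return sorted(candidates, key=lambda item: (-scores[item], item))

# Hypothetical activity stream of (user, item) interactions and a small catalog.
events = [
    ("ana", "tent"), ("ana", "stove"),
    ("ben", "tent"), ("ben", "stove"), ("ben", "lantern"),
    ("cy", "novel"), ("dee", "novel"), ("eve", "novel"),
    ("fay", "tent"),
]
catalog = ["lantern", "novel", "stove", "tent"]

print(popularity_ranking(events, catalog))
print(personalized_ranking(events, catalog, "ana"))
```

In this made-up data the popularity list leads with the novel for everyone, while the personalized list puts the lantern first for a user who has only looked at camping gear.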
Organizations that still depend on physical documents, or that have an archive of physical documents to scan and analyze, should consider using Amazon Textract. OCR’s limits are well-known, especially when it comes to accurately interpreting and formatting semi-structured blocks of text data such as form fields and tables, and the result is significant time devoted to manual correction in post-processing. Textract handles complex documents without the need for custom code or template maintenance; automating text interpretation and analysis further accelerates document processing workflows and makes it easier for organizations to maintain compliance.
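As a rough illustration of what form-aware output buys over flat OCR text, the sketch below pairs form keys with their values. It assumes a response shaped like Textract's document analysis output with form extraction enabled: a list of `Blocks`, where `KEY_VALUE_SET` blocks link to value blocks and to child `WORD` blocks. Since the service is in preview, that shape should be treated as an assumption, and the sample response is hand-written rather than real service output.

```python
def block_text(block, blocks_by_id):
    """Join the text of a block's WORD children into one string."""
    words = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)

def form_fields(response):
    """Map each detected form key to the text of its linked value block(s)."""
    blocks_by_id = {block["Id"]: block for block in response["Blocks"]}
    fields = {}
    for block in response["Blocks"]:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        values = []
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "VALUE":
                for value_id in relationship["Ids"]:
                    values.append(block_text(blocks_by_id[value_id], blocks_by_id))
        fields[block_text(block, blocks_by_id)] = " ".join(v for v in values if v)
    return fields

# Hand-written sample in the assumed response shape (not real service output).
sample_response = {"Blocks": [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Patient"},
    {"Id": "w2", "BlockType": "WORD", "Text": "name:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Jane"},
    {"Id": "w4", "BlockType": "WORD", "Text": "Doe"},
]}

fields = form_fields(sample_response)
print(fields)
```

With flat OCR, the same scan would yield only the string of four words; the key-to-value link is what removes the manual correction step.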
Medical organizations using software that depends on manually implemented rules to process their medical text should consider using Amazon Comprehend Medical. By removing the need to maintain a list of rules in-house, Comprehend Medical speeds the extraction and analysis of medical information from unstructured text fields like doctor’s notes and health records, improving processes such as medical coding, cohort analysis to recruit patients for clinical trials, and health monitoring of patients.
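In practice, replacing in-house rules might look something like the sketch below, which groups detected entities by category and drops low-confidence ones. It assumes a client exposing a `detect_entities(Text=...)` call that returns a list of `Entities` with `Text`, `Category`, and `Score` fields, as the boto3 `comprehendmedical` client does to the best of our knowledge. The `StubClient`, the note text, the scores, and the `min_score` threshold are all hypothetical, so the example runs without any AWS call.

```python
from collections import defaultdict

def entities_by_category(client, text, min_score=0.8):
    """Group entity text by category, keeping only confident detections."""
    response = client.detect_entities(Text=text)
    grouped = defaultdict(list)
    for entity in response.get("Entities", []):
        if entity.get("Score", 0.0) >= min_score:
            grouped[entity["Category"]].append(entity["Text"])
    return dict(grouped)

class StubClient:
    """Stand-in that returns a hand-written response; no AWS call is made."""

    def detect_entities(self, Text):
        return {"Entities": [
            {"Text": "ibuprofen", "Category": "MEDICATION", "Score": 0.98},
            {"Text": "headache", "Category": "MEDICAL_CONDITION", "Score": 0.93},
            {"Text": "knee", "Category": "ANATOMY", "Score": 0.41},
        ]}

# Invented note text; a real deployment would pass a real client instead of the stub.
note = "Patient reports headache and mild knee pain; advised ibuprofen as needed."
result = entities_by_category(StubClient(), note)
print(result)
```

The confidence threshold is a design choice left to the caller: for a task such as medical coding, low-scoring entities might be routed to human review rather than discarded.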
All organizations looking to use machine learning services from external providers need to consider whether outsourcing will work for their circumstances. Data privacy is a key concern, and even more so in regulated verticals with industry-specific rules such as HIPAA. Does the service you want to use respect those rules? Compliance may also require an explanation of why a model gives the results it does; merely accepting results from the black box at face value is insufficient. Machine learning products that automatically provide such an explanation in plain English do exist, but this feature is still uncommon and in its infancy.
With its latest announcements, Amazon continues to broaden the scope of customer issues it addresses with machine learning services. Medical companies need better text analytics yesterday, but struggle to comply with HIPAA while assessing the data they have. Customer-facing organizations face stiff competition when a competitor is only a click away. And any company trying to plan for the future based on past data grapples with understanding what factors affect future results. Amazon’s machine learning application services address common tactical business issues by reducing the work of implementing task-specific machine learning models to a matter of pure inputs and outputs. These services present outsourcing opportunities for overworked departments struggling to keep up.