July 23: From BI to AI (Cube Dev, Dremio, Google Cloud, Julia Computing, Lucata, Palantir, Redpoint, Sisense, Vertica, Zoom)

If you would like your announcement to be included in these data platform-focused roundups, please email lynne@amalgaminsights.com.

Product Launches and Updates

Dremio Launches SQL Lakehouse Service to Accelerate BI and Analytics

On July 21, at Subsurface Live, Dremio debuted Dremio Cloud, a cloud-native SQL-based data “lakehouse” service. The service marries various aspects of data lakes and data warehouses into a SQL lakehouse, enabling high-performance SQL workloads in the cloud and expediting the process of getting started. Dremio Cloud is now available in the AWS Marketplace.

Google Cloud Announces Healthcare Data Engine to Enable Interoperability in Healthcare

On July 22, Google Cloud announced Healthcare Data Engine, now in private preview. Healthcare Data Engine integrates healthcare and life sciences data from multiple sources such as medical records, claims, clinical trials, and research data, enabling a more longitudinal view of patient health along with advanced analytics and AI in a secure environment. With the introduction of Amazon HealthLake last week, it’s clear that expanding healthcare and life sciences analytics capabilities continues to be a top priority among data services providers.

Palantir Introduces Foundry for Builders

Dipping a toenail into the waters outside its usual customer base of large, established organizations, Palantir announced the launch of Foundry for Builders, providing access to the Palantir Foundry platform for startups under a fully-managed subscription model. Foundry for Builders is starting off with limited availability; the initial group of startups given access are all connected to Palantir alumni, with the hope of expanding to other early-stage “hypergrowth” companies down the road.

Redpoint Global Announces In Situ

On July 20, Redpoint announced In Situ, a service that provides data quality and identity resolution. In Situ uses Redpoint’s data management technology to supply identity resolution and data integration services in real time within an organization’s virtual private cloud, without needing to transfer said private data across the internet.

Sisense Announces Sisense Extense Framework

On July 21, Sisense debuted the Sisense Extense Framework, a way to deliver interactive analytics experiences within popular business applications. Initially supported apps include Slack, Salesforce, Google Sheets, Google Slides, and Google Chrome, now available on the Sisense Marketplace. The Sisense Extense Framework will be released more broadly later this year to partners looking to build similar “infusion” apps.

Vertica Announces Vertica 11

On July 20, at Vertica Unify 2021, Vertica announced the Vertica 11 Analytics Platform. Key improvements include broader deployment support, strengthened security, increased analytical performance, and enhanced machine learning capabilities.

Funding

Cube Dev Raises $15.5 Million to Help Companies Build Applications with Cloud Data Warehouses

On July 19, Cube Dev announced that it had raised $15.5M in Series A funding. Decibel led this round, with participation from Bain Capital Ventures, Betaworks and Eniac Ventures. The funding will be used to scale go-to-market activities and accelerate R+D on its first commercial product. Cube Dev also brought aboard Jonathan E. Cowperthwait of npm as Head of Marketing and Jordan Philips of Dashbase as Head of Revenue Operations to support its commercial expansion.

Julia Computing Raises $24M in Series A, Former Snowflake CEO Bob Muglia Joins Board

Julia Computing announced the completion of a $24M Series A funding round on July 19. Dorilton Ventures led the round, with participation from Menlo Ventures, General Catalyst, and HighSage Ventures. Julia Computing will use the funding to further develop JuliaHub, its secure, high-performance cloud platform for scientific and technical modeling, and to grow the Julia ecosystem overall. Bob Muglia, the former CEO of Snowflake, joined the Julia Computing board on the same day.

Lucata Raises $11.9 Million in Series B Funding to Introduce Next-Generation Computing Platform

Lucata, a platform to scale and accelerate graph analytics, AI, and machine learning capabilities, announced July 19 that it had raised $11.9M in Series B funding. Notre Dame, Middleburg Capital Development, Blu Ventures Inc., Hunt Holdings, Maulick Capital, and Varian Capital all participated in the round. The funding will fuel an “aggressive” go-to-market strategy.

Acquisitions

Zoom to Acquire Five9

On July 18, Zoom announced that it had entered into a definitive agreement to acquire Five9, a cloud contact center service provider, for $14.7B in stock. In welcoming Five9 to the Zoom platform, Zoom expects to build a better “customer engagement platform,” complementary to its Zoom Phone offering. Later in the week, Zoom also announced the launch of Zoom Apps and Zoom Events, further enhancing the collaboration capabilities of the primary Zoom video communications suite.

July 16: From BI to AI (AWS, CognitiveScale, GoodData, Hazelcast, Informatica, StrongBox Data Solutions, Vertica)

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Product Launches and Enhancements

Informatica Announces Unified Data Governance and Catalog As-a-Service in the Cloud

On July 13, Informatica announced the launch of its Cloud Data Governance and Catalog solution as a key part of its Intelligent Data Management Cloud. As part of Informatica’s continuing expansion to the cloud, Informatica customers will now be able to handle data cataloging, data quality, and both data governance and machine learning model governance through the same “pane of glass.” Amalgam Insights has a forthcoming post on Informatica’s integration of data governance and machine learning model governance.

AWS Announces General Availability of Amazon HealthLake

On July 15, Amazon debuted Amazon HealthLake, a data lake for healthcare and life sciences data that falls under HIPAA requirements. HealthLake uses machine learning to extract and appropriately transform unstructured health data to prepare it for use in analytics and AI models. The service is generally available now.

CognitiveScale Announces Launch Of Cortex Fabric Version 6

On July 15, CognitiveScale announced the release of Cortex Fabric Version 6, a low-code AI app development platform. Cortex 6 lets “citizen developers” build AI-based apps with their “Campaigns” visual framework, focusing on business optimization and process automation use cases.

GoodData and Vertica Partner to Accelerate Cloud-Native Self-Service Analytics Adoption in the Enterprise

GoodData and Vertica announced a strategic partnership on July 15. GoodData.CN, GoodData’s cloud native analytics services, will connect to Vertica’s data warehouse to allow non-technical users to perform self-service analytics.

Hazelcast Unveils Real-Time Intelligent Applications Platform

On July 14, Hazelcast announced the Hazelcast platform, which will allow users to merge streaming data with data at rest. Because Hazelcast can handle real-time event streams as well as access to data lakes and data warehouses, it can act as a single point of access for all types of data. Hazelcast is currently in beta; general availability is expected in August 2021. In addition, certain features of Hazelcast will be available through the Hazelcast Platform for IBM Cloud Paks.

StrongBox Data Solutions Announces StrongLink 3.2

StrongBox Data Solutions announced the availability of StrongLink 3.2, a data management platform. StrongLink automates policy enforcement across a wide variety of data sources and storage types, aiming to eliminate data silos while providing replication to protect said data’s existence. StrongLink 3.2 is available immediately.

July 9: From BI to AI (AnyVision, Google Cloud, IBM, Immuta, Obviously AI, Opaque)

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Funding

AnyVision Raises $235M from SoftBank Vision Fund 2 and Eldridge

AnyVision, a facial recognition AI company, has closed a $235M Series C funding round led by SoftBank’s Vision Fund 2 and Eldridge. Amit Lubovsky, director of SoftBank Investment Advisors, will join the board as part of the transaction. Funding will be directed towards further development of AnyVision’s Access Point AI software, as well as further innovation of its SDKs for edge computing functionality. AnyVision’s funding announcement comes at an interesting time for facial recognition startups; concerns around data privacy are subjecting companies creating and using facial recognition to growing scrutiny.

Obviously AI Increases Seed Round Funding to $4.7M

Obviously AI, a no-code AutoML startup, has raised an additional $1.1M from the University of Tokyo Edge Capital Partners, as well as Trail Mix Ventures and B-Capital. The funding will go towards extending Obviously AI to serve more use cases, as well as expanding Obviously AI’s presence in Asian markets. The concept of no-code AI model building is the unicorn everyone dipping into data science is seeking, but Obviously AI is currently limited to supervised learning use cases, and broadening their scope to cover unsupervised learning is the next … obvious step.

Opaque Raises $9.5 Million Seed to Unlock Encrypted Data with Machine Learning

Opaque, a secure data analytics platform, announced July 7 that it had raised a seed round of $9.5M led by Intel Capital. Race Capital, The House Fund, and FactoryHQ also participated in this round. Opaque lets companies analyze encrypted cloud-based data without exposing the data to the cloud provider. Funding will go towards Opaque’s open source contributions to the data security community.

Product Launches and Updates

Immuta Becomes First Data Access Control Solution for Snowflake Partner Connect

On July 7, Immuta, a cloud data access control provider, announced its availability in the Snowflake Partner Connect portal. Snowflake users will now be able to use Immuta to configure automated data access control around their data. The Immuta Snowflake integration launches as an Immuta instance preconfigured with a Snowflake user’s connection credentials, minimizing setup complexity and time needed.

Hiring and Departing

Google announces Adaire Fox-Martin as its new EMEA Cloud president

Google Cloud has appointed Adaire Fox-Martin as its new EMEA Cloud president. Fox-Martin moves over from a 14-year tenure at SAP, most recently as an Executive Board Member leading Global Customer Success. Prior to that, Fox-Martin spent nearly two decades at Oracle.

IBM’s Jim Whitehurst Says He’s Leaving to Find a New Chance to Run Something

Over the holiday weekend, IBM announced that Jim Whitehurst would be stepping down as president, though he would remain in an advisory role for the time being. In an interview this week with Barron’s, Whitehurst acknowledged that he wants to be a CEO again, and that with the appointment of Arvind Krishna to that spot at IBM, his own chances of holding that position were slim. Whitehurst had come over to IBM with the Red Hat acquisition, having held the CEO position there since 2007.

July 2: From BI to AI (Anaconda, Facebook, JetBrains, Tableau, TIBCO)

In anticipation of the long holiday weekend for Americans and Canadians, news was fairly light in the data world this week; most announcements were around updates and enhancements to existing products.

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Product Launches and Updates

Tableau Extends Augmented Analytics in Tableau 2021.2

On June 29, Tableau announced the release of Tableau 2021.2, with new and enhanced augmented analytics capabilities. “Ask Data,” a capability that allows users to ask business questions of their data using natural language, and “Explain Data,” a function that provides explanations of data points, both have new interfaces that enhance users’ understanding of their data. Other new features in Tableau 2021.2 include the ability to save clean data from Tableau Prep into Google BigQuery, and to implement machine learning models from Amazon SageMaker within Tableau dashboards.

TIBCO Spotfire 11.4 LTS Release

TIBCO announced the release of Spotfire 11.4 LTS on June 30. Key features in this release include the ability for nontechnical users to embed advanced analytics functions into Spotfire apps, and over a dozen new custom visualizations and apps available as “Spotfire Mods” on the TIBCO Exchange.

Anaconda Collaborates with Intel to Improve Speed and Scale for Machine Learning Workflows

Anaconda announced enhancements to its ongoing partnership with Intel, including better access to libraries and packages optimized for Intel hardware to enhance the performance of machine learning models. Of note, the Intel Extension for Scikit-learn is now available in Anaconda’s package repository; Anaconda says models built using the extension run 27-36x faster than models based on the baseline Scikit-learn.

JetBrains: Announcing Datalore Enterprise

On June 29, JetBrains announced the availability of Datalore Enterprise, an on-premises collaborative version of their single-user cloud-based data science platform. Datalore Enterprise will provide JetBrains collaboration tools atop Jupyter Notebooks, along with existing features of Datalore such as PyCharm coding assistance tools.

Facebook AI Announces Habitat 2.0, plus Introducing the Habitat-Matterport 3D research data set

Finally, Facebook AI announced the latest version of their Habitat platform (Habitat 2.0), a simulation platform that lets AI researchers teach machines to navigate and interact with both virtual and physical 3D environments. Improvements include ReplicaCAD, an extension of Facebook’s Replica data set, built to support movement and object manipulation as a digital twin. In collaboration with Matterport, Facebook AI also published HM3D, an open-source licensed data set consisting of over 1,000 indoor 3D scans. (This last year, prospective property buyers couldn’t go to open houses, but they could at least investigate a given property’s digital twin, and Matterport supplied a number of these virtual house tours for property listings.) Future AI-enhanced assistants and robots will need to interact with complex 3D environments; advancing “embodied” AI will be a top priority in order to build such assistants. Suggested scenarios include asking your AI-enhanced glasses where your house keys were last observed, or asking a robot to check your desk for your laptop and, if it’s there, to bring it to you.

June 25: From BI to AI (including Apache Kafka, Confluent, Dataiku, DataRobot, Domino Data Lab, Firebolt, Incorta, Palantir, Primer, Rasgo, Splunk)

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Product Launches and Updates

Domino 4.4 Now Available

On Tuesday, June 21, Domino Data Lab announced the availability of Domino 4.4. New capabilities include Durable Workspaces, allowing data scientists to operate with multiple environments open at once; CodeSync, enhancing Domino’s existing reproducibility capabilities with native integration with common Git repositories; and the abilities to encrypt data in transit and mount NFS volumes directly to Domino. Domino 4.4 is available for existing customers immediately.

Dataiku Launches in AWS Marketplace

On June 21, Dataiku announced its availability in the AWS Marketplace. AWS customers can now use Dataiku’s visual interface to orchestrate their data pipelines and machine learning models applied to their cloud data, and Dataiku projects based on AWS-hosted data can also incorporate AWS Machine Learning Services such as computer vision or text analytics.

Palantir, DataRobot Partner to Bring Speed and Agility to Demand Forecasting Models

DataRobot and Palantir announced a new partnership on Thursday, June 24, around solving demand forecasting problems for retailers. The new Demand Forecasting framework links Palantir Foundry with DataRobot’s Model Development and Model Deployment capabilities. Prepped data is piped directly from Foundry into DataRobot where forecasting models are trained, then brought back into Foundry for operationalization.

Splunk Launches New Security Cloud

On June 22, Splunk debuted the Splunk Security Cloud, a SecOps platform with integrated security analytics and threat intelligence and an open ecosystem to correlate data across all security tools. Splunk also announced a $1B investment from Silver Lake; the funding will go towards further growth of Splunk and its ongoing cloud transformation, as well as managing a newly authorized share repurchase program.

Funding

Firebolt Ignites Growth with a $127M Series B Funding Round

Firebolt, a cloud data warehouse company, raised $127M in Series B funding this week, following up on a $37M Series A round from December 2020. All investors from the A round participated, including Angular Ventures, Bessemer Venture Partners, TLV Partners, and Zeev Ventures, with new investors Dawn Capital and K5 Global joining the B round. Firebolt will use the funding to expand its product, engineering, and go-to-market teams.

Incorta Raises $120M in Series D Funding

Wednesday, June 23, Incorta announced a $120M Series D funding round led by Prysm Capital. Other participants included GV, Kleiner Perkins, M12, Sorenson Capital, Telstra Ventures, Wipro Ventures, and new investor National Grid Ventures. This round of funding will go towards expanding Incorta’s go-to-market operations and meeting demand for Incorta’s data analytics platform.

Primer Raises $110M Series C

Primer, a natural language processing company, raised $110M in a Series C funding round, announced on Tuesday, June 24. Lee Fixel’s Addition led the round, with participation from existing investors Amplify Partners, Avalon Ventures, Bloomberg Beta, DCVC, Lux Capital, and Section 32, as well as new investors Crumpton Ventures, J2 Ventures, Sands Capital, and Steadfast. Primer also announced two partnerships: one with Microsoft to make Primer available within Azure, as well as a partnership with Palantir to make Primer available within the Palantir platform.

Rasgo Raises $20M Series A

Rasgo, a feature store, announced that it had raised an additional $20M in funding as a Series A round. Insight Partners led the round, with participation by existing investor Unusual Ventures. Rasgo will use the funds to expand its team with a focus on engineering talent, accelerate product development, and build its go-to-market.

Confluent IPO

Confluent, a data streaming platform, had its IPO June 24, raising $828M. Even with an initial offering price of $36/share, above its intended range of $29-$33/share, shares of Confluent closed up at over $45/share by the end of the first day of trading to reach a valuation of over $11 billion, indicating the continued importance of streaming analytics in supporting two key challenges: real-time context and real-time response.

June 18: From BI to AI (Altair SmartWorks, Crate.io, Dataiku, Dataiku Online, DataRobot, Neo4j, SAS, Transform)

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Funding

Neo4j Announces $325 Million Series F Investment, the Largest in Database History

On June 17, Neo4j announced a $325M Series F funding round. Eurazeo led the round, with participation from existing investors Creandum, Greenbridge Partners, and One Peak, as well as new participants DTCP, GV, and Lightrock. Neo4j plans to use this money along three key vectors: buffing up their multi-cloud service offerings, growing capabilities to support enhanced machine learning models in graph-based data science, and expanding their market reach. Amalgam Insights’ Hyoun Park assesses the Neo4j funding more thoroughly, and highlights the importance of graph databases as the next step in enterprise analytics, and the key role they will have in supporting the next generation of machine learning models.

Introducing Transform: a ‘metrics store’ to make data accessible

Transform, a centralized metrics store, has come out of stealth, announcing $24.5M in funding across two rounds. Index Ventures and Redpoint Ventures led the round, with participation from Fathom Capital and Work Life Ventures. Transform is looking to double their headcount with this funding. General availability of Transform is projected for Fall 2021.

Crate.io Secures $10 Million in Funding

Crate.io, the developers of the CrateDB database platform, raised $10M in additional funding, bringing their total funding up to $31M. Draper Esprit and Vito Ventures participated in this round. The funding will be used to expand sales, grow functionality and add more partner integrations, and promote the open source developer community around CrateDB.

Product Launches and Updates

Cloud-native Altair® SmartWorks™ Empowers Enterprises to Make Data-driven Decisions

On June 14, Altair debuted Altair SmartWorks, a cloud-native analytics platform. SmartWorks integrates the data prep capabilities of Altair Monarch and their machine learning and predictive analytics solution Knowledge Studio under one roof, providing access to analytics, machine learning, and IoT no matter one’s comfort level with coding. SmartWorks is available now via Altair Units, their subscription-based licensing model.

Dataiku Announces Fully Managed, Online Analytics Offering

On June 14, Dataiku launched Dataiku Online, providing cloud-based access to their machine learning platform for smaller organizations without the extensive IT departments of their larger counterparts. In particular, seed-stage companies and other young startups are eligible for highly discounted pricing. A 14-day free trial is available now. Via Dataiku Online, customers can access data storage tools from Google BigQuery, Amazon Redshift, and Snowflake, and Snowflake customers can likewise access Dataiku Online through the Snowflake Marketplace.

DataRobot 7.1 Introduces Enhancements to Take AI Projects to the Next Level

On June 15, DataRobot announced its 7.1 platform release. Key new features include MLOps Management Agents, which manage remote machine learning models’ lifecycles; the no-code AI App Builder to turn deployed models into AI-based apps without needing customers to write code; and the feature discovery integration with Snowflake, announced last week at Snowflake Summit. The 7.1 release is available now.

Hiring

SAS Names Jenn Chase as Chief Marketing Officer, Executive Vice President

SAS promoted Jenn Chase, Senior Vice President and Head of Marketing, to the Chief Marketing Officer and Executive Vice President position. Chase’s 20-year career with SAS includes time in both R+D and marketing. As SVP, Chase initiated the relaunch of the SAS brand earlier this year, and led the pandemic-induced online pivot for the two most recent SAS Global Forums.

DataRobot Expands C-Suite with New CPO, CTO, and CMO

DataRobot grew its C-Suite this week, pulling in Elise Leung Cole from Cisco to serve as the new Chief People Officer, and promoting Michael Schmidt and Nick King from within as the new CTO and CMO respectively. Cole previously was the VP & Deputy General Counsel at Cisco, leading the team supporting sales and marketing, and creating compliance, training, and career developments within the organization. Prior to her time at Cisco, Cole served as General Counsel at AppDynamics.

Schmidt came to DataRobot as the founder of Nutonian, which DataRobot acquired in 2017. He helped develop DataRobot’s Automated Time Series product, and led the partnership with the US government to ensure speedy and equitable COVID-19 vaccine trials. King joined DataRobot in April as the SVP of Marketing. Prior to that, King held executive positions at Cisco, VMware, Google, and Microsoft. The expanded CMO role puts King in charge of global marketing and brand strategy.

Neo4j Takes on the Battle for Context with a $325 Million F Round

On June 17th, 2021, Neo4j, a graph database company, announced a $325 million funding round led by a $100 million investment from Eurazeo and joined by new investors GV (previously named Google Ventures), DTCP, and Lightrock as well as existing investors Creandum, Greenbridge Partners, and One Peak. Eurazeo is a private equity company with over 15 billion Euro in Assets Under Management as part of a larger investment portfolio of over 22 billion Euro. With this round, Eurazeo Managing Director Nathalie Kornhoff-Brüls joins the Neo4j Board of Directors.

This monster funding round speaks to the confidence that investors have in the future of Neo4j. But in this particular instance, Amalgam Insights believes that this large funding amount is especially important because of what it means for breaking the status quo of enterprise analytics.

Analytics and data management in the business world have been built around the relational database focused on controlling and governing individual data inputs. This fundamental framework has been very useful in creating an environment that can be configured to present a single shared source of truth. However, it is not especially good at supporting and processing data relationships, which is a challenge in today’s data environment as data grows quickly and data relationships increasingly represent some level of transaction or behavior aligned with a business activity that needs to be tracked or analyzed in near-real-time.
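To make the difference concrete, here is a minimal sketch that answers a two-step relationship question twice: first in the style of a relational self-join over an edge table, then by following stored relationships directly. The names, data, and function names are hypothetical, and this is not how any particular database engine is implemented; it only illustrates why relationship-heavy questions map more naturally onto a graph layout.

```python
from collections import defaultdict

# Hypothetical "follows" relationships stored as rows, as in a relational table.
follows_rows = [
    ("ana", "ben"),
    ("ben", "cy"),
    ("ben", "dee"),
    ("cy", "eli"),
]


def two_hops_relational(rows, start):
    """Mimic a self-join: compare every pair of rows to find start -> x -> y."""
    return sorted({
        second_dst
        for first_src, first_dst in rows
        for second_src, second_dst in rows
        if first_src == start and first_dst == second_src
    })


def build_adjacency(rows):
    """Store each vertex with its outgoing relationships, graph-style."""
    adjacency = defaultdict(list)
    for src, dst in rows:
        adjacency[src].append(dst)
    return adjacency


def two_hops_graph(adjacency, start):
    """Follow relationships directly from the start vertex, two steps out."""
    return sorted({
        far
        for near in adjacency.get(start, [])
        for far in adjacency.get(near, [])
    })


adjacency = build_adjacency(follows_rows)
print(two_hops_relational(follows_rows, "ana"))
print(two_hops_graph(adjacency, "ana"))
```

Both functions return the same answer, but the join-style version inspects every pair of rows while the graph-style version only touches the relationships reachable from the starting vertex. Each additional hop widens that gap.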

In addition, the hype regarding artificial intelligence and machine learning has finally crossed over into practical reality as the toolkits for operationalizing models have reached mainstream availability. Even though enterprises may not fully understand machine learning, they can easily purchase access or use open source projects to obtain the data management, model creation, storage, and compute capabilities needed to support machine learning projects. But for companies to fully execute on the promise of machine learning, they need to create more efficient relationship-based data environments that allow models to be tested and to provide results. Building relationship-centered data is part of what I originally called the Battle for Context when Amalgam Insights was first founded.

And now four years later, Neo4j has a chance to deliver on this challenge for context at a global scale. Neo4j has been a graph data leader for years, especially since it started back in 2007 before the need for graph database management was fully clear to the enterprise market at large. Since then, Neo4j has been a stalwart in its market education of graph data. But it has fundamentally been fighting a status quo where companies have been either unwilling or unable to translate their key transactional data environments into the relationship-based models that will be necessary for broad machine learning. With this round of funding, Neo4j finally has a chance to conduct the volume of marketing and sales needed to educate the data and analytics audience. In contrast to other large rounds of funding announced in the data world, such as Snowflake’s $479 million round in February 2020 or Databricks’ $1 billion round in February 2021, Amalgam Insights believes that Neo4j’s funding round serves a slightly different purpose.

Those previously-mentioned funding rounds were all seen as final rounds of funding before an upcoming IPO with participation by software vendor partners in their ecosystem. In contrast, Neo4j has both a more foundational opportunity and a more foundational challenge: graph should be the foundation of enterprise machine learning and relationship-driven data environments, but the ecosystem and platform maturity are still not quite where the data warehouse market is. Amalgam Insights sees this round as being more similar to DataRobot’s $270 million round raised in November 2020, which allowed DataRobot to continue acquiring companies and building out its platform to fit enterprise challenges.

Ultimately, enterprises should associate two goals with graph data. The first is unlocking relationships within data that would take orders of magnitude more time, money, and skill to discover in relational data. The second is unlocking tens to hundreds of millions of dollars in value through machine learning and artificial intelligence operationalization opportunities that have already been identified but cannot run at high-performance levels without a better data environment. Acquiring and using graph databases is the technological bottleneck that stands between enterprises and fully unlocking AI, and we are only now reaching a point where the understanding of relationship data, training data, machine learning feedback, and transactional data is sufficient for business managers to understand the value of dedicated graph databases rather than simply placing a graph structure on relational or multimodel data.

Recommendation to the Amalgam Insights database and analytics community

At the very least, start learning about graph data structure as combinations of edges, vertices, and relationships as well as linear algebra to gain an understanding of how graph data differs from the standard high school algebraic logic of relational databases. Yes, learning math and a new set of data relationships is not as easy as downloading a library or learning a new software functionality. But graph relationships are a fundamental change in the way that data will be managed over the next couple of decades and there will be a great deal of work needed both to ETL/ELT relational data into graph databases as well as to manage graph databases for the upcoming world of AI replacing aspects of standard business analytics.
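As a starting point, the link between graph structure and linear algebra can be sketched in a few lines. The example below uses hypothetical vertex names and plain Python lists rather than any graph database API: it encodes a small directed graph as an adjacency matrix and multiplies the matrix by itself, so that each entry of the product counts the two-step paths between a pair of vertices.

```python
# Hypothetical vertices and directed edges for illustration only.
vertices = ["ana", "ben", "cy", "dee"]
edges = [("ana", "ben"), ("ana", "cy"), ("ben", "dee"), ("cy", "dee")]


def adjacency_matrix(vertices, edges):
    """Build a square 0/1 matrix where entry [i][j] marks an edge from i to j."""
    index = {name: i for i, name in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = adjacency_matrix(vertices, edges)
A2 = matmul(A, A)

# Entry [i][j] of A2 counts the distinct two-step paths from vertex i to vertex j.
ana, dee = vertices.index("ana"), vertices.index("dee")
print(A2[ana][dee])
```

Here the entry for ana and dee is 2, because dee can be reached in two steps through either ben or cy. Raising the matrix to higher powers counts longer paths in the same way, which is one reason linear algebra appears so often in graph analytics.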

If your organization is looking at relationship analytics or machine learning initiatives beyond a single project, look at Neo4j, which currently has a dominant position as a standalone graph database and is available as open source under the GNU General Public License (GPL v3).

And if you have questions about the current state of Neo4j or are trying to bridge gaps from BI to AI in your organization, please contact research@amalgaminsights.com to schedule time to speak with our analysts. We look forward to serving you in our continued role in helping you to understand the future of your data.

June 11: From BI to AI: Special Snowflake Edition (Alteryx, Amazon, Dataiku, DataRobot, Domino Data Lab, Informatica, Talend, and of course, Snowflake)

This week’s “From BI to AI” update is a little different from the usual. Snowflake Summit occurred June 8-10, bringing a slew of announcements related to Snowflake’s new features, and many Snowflake partners timed their own related announcements in sync with the Summit.

Snowpark and Java UDFs

On Tuesday, June 8, Snowflake launched Snowpark, their “developer experience.” Data scientists, data engineers, and developers can build in Java or Scala within Snowpark, and then execute their workloads directly within Snowflake.

Also on the coding side, Snowflake announced support for Java UDFs (user-defined functions) within Snowflake, allowing customers to import their custom code and business logic to Snowflake. Both Snowpark and Java UDFs within Snowflake are currently in private preview, with public preview coming soon.

Snowflake also announced the Snowpark Accelerated Program, where partner vendors can access Snowflake technical experts and be provided with additional exposure to existing Snowflake customers.

Snowflake Partner Announcements

Numerous Snowflake partner vendors followed up with their own announcements on Wednesday, June 9.

Snowflake can now be used as a data source within Amazon SageMaker Data Wrangler. This integration allows data prep for machine learning in SageMaker to occur in Snowflake.

Alteryx announced a deeper integration of Alteryx with Snowflake. Alteryx Designer is now directly available on Snowflake; data prep, data blending, and automated analytics processing are pushed down into Snowflake for better performance and scalability. Joint Alteryx-Snowflake customers can also augment existing data sources with those available on the Snowflake Data Marketplace. Current Snowflake customers have access to a free trial of Alteryx within their Snowflake account.

Dataiku debuted their Snowflake integration with Snowpark and Java UDFs. Dataiku-Snowflake users will be able to push computation down to Snowflake, so that data prep and scoring can happen within Snowflake.

DataRobot’s new Snowflake integration also joins DataRobot with Snowpark, growing the existing DataRobot-Snowflake partnership. Data prep tasks from Zepl (a recent DataRobot acquisition) can be pushed into Snowflake for feature engineering, while providing a preconfigured environment for model development within Snowpark. DataRobot’s Java Scoring Code also pairs with Snowflake Java UDFs to enable DataRobot models to do scoring within Snowflake.

Domino Data Lab inaugurated its Snowflake partnership this week with Snowpark integration as well. Joint Domino-Snowflake customers will be able to build data pipelines within Snowpark, and execute MLOps workflows from Domino within Snowflake.

Building on its 2020 Partner of the Year status, Informatica announced tighter integrations between its Intelligent Data Management Cloud and Snowflake, offering support for Java UDFs for joint Informatica-Snowflake customers and advancing its mass-ingestion ELT capabilities. Users will be able to transform, cleanse, and govern data from a wide variety of enterprise applications automatically, en masse, on its way to ingestion in Snowflake.

Finally, Talend revealed Talend Trust Score for Snowflake. This new capability will allow joint Talend-Snowflake users to verify data quality within Snowflake, using Snowpark and Java UDFs.

(Sidenote: the acquisition of Talend by Thoma Bravo is proceeding apace; Thoma Bravo has begun the tender offer to acquire all outstanding ordinary shares and American Depositary Shares of Talend.)

If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Alation Raises $110 Million D Round to Help Businesses Better Understand Their Data

On June 3rd, Alation announced a $110 million D round led by Riverwood Capital with participation from new investors Sanabil Investments and Snowflake Ventures. Current investors Costanoa Ventures, Dell Technologies Capital, Icon Ventures, Salesforce Ventures, Sapphire Ventures, and Union Grove Partners also contributed to the round. With this round, Alation has raised a total of $217 million and is valued at $1.2 billion.

One of the first things that stands out with this investor list is how Alation serves as an example of “investipartnering” where business partners also become limited equity partners. With Snowflake, Salesforce, and Dell all on the cap table, Alation stands out as being a strategic partner for some of the biggest cloud players on the planet. 

Here’s why this investment makes sense in today’s data environment.

One of the biggest challenges for data in 2021 is effectively governing and defining data across a wide variety of sources. Alation has been a pioneer and is now a consistent market leader among data catalogs from both a revenue and a functionality perspective.

Yet, there is still a massive greenfield opportunity to rationalize taxonomies, naming conventions, integrations, and data-centric decisioning processes within the larger enterprise data ecosystem. These data problems were already challenging enough for analytics, where businesses have had data warehouse, master data, and data integration tech for decades. But now this data also has to be prepped for machine learning and AI, where these structures are less useful. One reason that a variety of industry estimates state that data scientists spend as much as 80% of their time cleansing data is that data scientists have either eschewed traditional enterprise data structures or are simply unaware of the analytic data ecosystem that has been built in enterprises over the past several decades as they seek to tackle problems.

In an agile, “Post-Big Data” data world, the true hub of data intelligence is at either the catalog or datalake level, depending on how data is used and organized. In today’s data world, the data warehouse is an important piece of core infrastructure for enterprise data, but is not typically agile enough to support the rapid data selection, augmentation, transformation, and analysis associated with both self-service analytics and machine learning efforts. In this modern data context, Alation is a vital player in advancing the cause of referencing, contextualizing, and linking datasets together rapidly.

And the investment by Snowflake Ventures reflects that Snowflake knows they need more control over less structured data. Snowflake is under pressure to justify its massive valuation as a cloud data leader and now has to meet the growth expectations of being worth well over $50 billion and having had a peak valuation of over $125 billion in its brief tenure as a public company. Alation will be a vital part of Snowflake’s story in providing a more agile environment for the entirety of enterprise data as Snowflake moves closer to a variety of datalake capabilities that allow for more flexible data transfer.

I’ve covered Alation since its 2015 Series A, back when I was the Chief Research Officer at Blue Hill Research. That round was led by Costanoa Ventures, which has since established itself as a premier early-stage investor in data-driven startups. Alation’s focus on data navigation at a time when Hadoop was seen as The Big Data Answer ended up being prescient, and Alation’s value is now established with over 250 enterprise clients.

But there is a larger opportunity. As Alation has expanded from a data catalog solution to a broader data discovery, context, governance, and collaboration solution and as the challenges of data and metadata management move downmarket, Alation’s capabilities are increasingly aligned with fundamental market needs to categorize and share data effectively.

To become a global solution, Alation needs to get into thousands of organizations, a goal that I think is now realistic with this latest round of funding that both boosts sales in the short term and sets a path to ongoing scalable growth.

Recommendation for the Data Management Community

The key takeaway for the data community is that legacy data management tools typically lack the speed and functionality necessary to identify, classify, and organize new data for new analytics, machine learning, and AI use cases. This includes everything from unstructured documents to relevant binary files to time-series, graph, and geographic data. This problem has driven both the commercial and investor interest in Alation and this problem is moving downmarket as more organizations seek to start building scalable and repeatable machine learning, data science, and analytic application development environments. Organizations that are not actively planning to improve their metadata and data collaboration efforts will find themselves fundamentally hampered in trying to make the leap from BI to AI and in keeping up with the new business world of augmented, automated, process mapped, natural language-based, and iterative feedback-driven transformation. Behind all the buzzwords, companies must first understand their existing data and contextualize their new data.

June 4, 2021: From BI to AI featuring Alation, Cazena, Cloudera, Datacoral, Dataiku, Iterative, and Stemma

This week’s roundup From BI to AI features Alation, Cazena, Cloudera, Datacoral, Dataiku, Iterative, and Stemma. If you would like your announcement to be included in Amalgam Insights’ weekly data and analytics roundups, please email lynne@amalgaminsights.com.

Acquisitions

Cloudera Acquires Datacoral and Cazena, is Acquired by Clayton, Dubilier, and Rice and KKR for $5.3 Billion

On June 1, Cloudera announced that it had agreed to be acquired by investment companies Clayton, Dubilier, and Rice, and KKR for $5.3B, transitioning to a private company. Financial results for Q1 2021 were released at the same time, with subscription revenue up 7% year over year.

Cloudera also acquired two SaaS companies in separate transactions. Datacoral enables data transformations and data integration, while Cazena implements quick cloud data lakes. Both companies provide fully managed services that facilitate data preparation for self-service analytics.

Funding

Alation Announces $110 Million Series D to Accelerate Growth

On Thursday, June 3, Alation, an enterprise data intelligence platform, announced that it had raised a $110M Series D funding round. Riverwood Capital led this round of funding. Other participants included existing investors Costanoa Ventures, Dell Technologies Capital, Icon Ventures, Salesforce Ventures, Sapphire Ventures, and Union Grove Partners, along with new investors Sanabil Investments and Snowflake Ventures. Amalgam Insights’ Hyoun Park wrote about this example of “investipartnering” and provided recommendations for the data management community.

Stemma Launches, Reports Seed Funding of $4.8 Million

On Thursday, June 3, Stemma announced that it had raised $4.8M in seed funding, led by Sequoia, and officially launched its data catalog product. Built atop the open-source data catalog Amundsen, Stemma provides enterprise-scale management capabilities and an intelligence layer based on relevant context.

MLOps Company Iterative Raises $20 Million Series A Funding Led by 468 Capital

Iterative.ai, an MLOps platform, announced Wednesday, June 2 that it had raised a $20M Series A round. 468 Capital and Florian Leibert led the round, which also included prior investors True Ventures and Afore Capital. Iterative.ai also debuted its first commercial product, DVC Studio, a visual front-end on its open source projects DVC (Data Version Control) and CML (Continuous Machine Learning) intended to enhance collaboration above and beyond data scientists’ usual Git methods.

Product Launches and Updates

Dataiku Now Available in the Microsoft Azure Marketplace

On June 1, Dataiku announced availability through the Azure Marketplace. Azure customers can now purchase Dataiku with their existing Azure cloud budget and relationship, taking advantage of integrated access to Azure’s cloud storage and compute resources for their data science workflows.