The top 5 enterprise analytics stories of 2021 (and a peek into 2022)


In 2021, everything from databases to baseball, no-code AI for data scientists, graph analytics, and even events got an analytics makeover.

Heading into 2022, Chris Howard, the chief of research at Gartner, and his team wrote in their Leadership Vision for 2022 report on the Top 3 Strategic Priorities for Data and Analytics Leaders that “progressive data and analytics leaders are shifting the conversation away from tools and technology and toward decision-making as a business competency. This evolution will take time to achieve, but data and analytics leaders are in the best position to help orchestrate and lead this change.”

In addition, Gartner’s report predicts that adaptive governance will become more prominent in 2022: “Traditional one-size-fits-all approaches to data and analytics governance cannot deliver the value, scale, and speed that digital business demands. Adaptive governance enables data and analytics leaders to flexibly select different governance styles for differing business scenarios.”

The enterprise analytics sector this year foreshadowed much of what’s to come. Here’s a look back at the top stories in this sector from 2021, and at where these themes may carry the industry next.

Databases get real-time analytics capabilities and integrations

Rockset integrated its analytics database with both MySQL and PostgreSQL relational databases to enable organizations to run queries against structured data in real time.

Rather than having to shift data into a cloud data warehouse to run analytics, organizations can now offload analytics processing to a Rockset database running on the same platform.

The company’s approach is designed to analyze structured relational data, as well as semi-structured, geographical, and time-series data, in real time. Complex analytical queries can also be scaled to include JOINs with other databases, data lakes, or event streams. In addition to integrations with open source relational databases, the company also provides connectors to MongoDB, DynamoDB, Kafka, Kinesis, Amazon Web Services, and Google Cloud Platform, among others.
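To make the idea of an analytical JOIN between relational data and an event stream concrete, here is a minimal sketch. It does not use Rockset or any of its connectors: Python's built-in SQLite module stands in for the analytics database, and the tables, columns, and rows are hypothetical, invented purely for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the analytics
# database. Table names, columns, and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE click_events (customer_id INTEGER, page TEXT);
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC'), (3, 'EMEA');
    INSERT INTO click_events VALUES
        (1, '/pricing'), (1, '/docs'), (2, '/pricing'), (3, '/pricing');
""")

# Join the event records to the relational customer table and
# aggregate clicks per region in a single analytical query.
rows = conn.execute("""
    SELECT c.region, COUNT(*) AS clicks
    FROM click_events AS e
    JOIN customers AS c ON c.id = e.customer_id
    GROUP BY c.region
    ORDER BY clicks DESC, c.region
""").fetchall()
conn.close()

print(rows)
```

In a real-time setup the event table would be fed continuously rather than loaded once, but the shape of the query stays the same.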

What stood out most about this advancement, though, isn’t specific to Rockset. “As the world moves from batch to real-time analytics,” the company stated in its press release, “and from analysts running manual queries to applications running programmatic queries, traditional data warehouses are falling short.” This trend in real-time analytics is further propelled by the swift move several companies made to a virtual and all-online infrastructure due to the pandemic. Real-time analytics in the virtual space will allow companies to more accurately index, strategize, and create new applications using their data.

Popular baseball analytics platform moves to the cloud

Baseball fans know well that the data MLB now makes available goes beyond the traditional hits, runs, and errors. The sport is becoming as complex in its data and statistics as it is in its ever-growing list of time limits and league rules.

Fans now regularly consult a raft of online sites that use this data to analyze almost every aspect of baseball: top pitching prospects, players who hit the most consistently in a particular ballpark during a specific time of day, and so on.

One of those sites is FanGraphs, which has transitioned the SQL relational database platform it relies on to process and analyze structured data to a curated instance of the open source MariaDB database, which is deployed on the Google Cloud Platform.

FanGraphs uses the data it collects to enable its editorial teams to deliver articles and podcasts that project, for example, playoff odds for a team based on the results of the SQL queries the company crafts. These insights can assist a baseball fan participating in a fantasy league, someone who wants to place a more informed wager on a game at a venue where gambling is legalized, or a game developer creating the latest MLB The Show video game. All of the above require high volumes of data.

One of the things that attracted FanGraphs to MariaDB is the level of performance that it could attain using a database-as-a-service (DBaaS) platform.

“On top of [MariaDB’s] SkySQL’s ease and performance, the exceptional service from our SkyDBAs have enabled us to completely offload our database responsibilities. That help goes far beyond day-to-day maintenance, backup, and disaster recovery. We find our SkyDBA looks at things we wouldn’t necessarily keep an eye on to secure and optimize our operations,” David Appelman, founder and CEO of FanGraphs, stated in a press release.

The explosion of data calls for an explosion of efficiency to manage it, and it’s a trend the industry can expect to see more of heading into 2022.

Data scientists will soon get a hand from no-code analytics

SparkBeyond, a company that helps analysts use AI to generate new answers to business problems without requiring any code, released SparkBeyond Discovery.

The company aims to automate the job of a data scientist. Typically, a data scientist looking to solve a problem may be able to generate and test 10 or more hypotheses a day. With SparkBeyond’s machine, millions of hypotheses can be generated per minute from the open web and a client’s internal data, the company says. Additionally, SparkBeyond explains its findings in natural language, so a no-code analyst can understand them.

The company says its auto-generation of predictive models for analysts puts it in a unique position in the marketplace of AI services. Most AI tools aim to help the data scientist with the modeling and testing process once the data scientist has already come up with a hypothesis to test.

The significance here essentially comes down to “time is money.” The more time a data scientist saves in solving problems and testing hypotheses, the more money a company saves in turn. “Analytics and data science teams can now leverage AI to uncover hidden insights in complex data, and build predictive models with no coding required [while leveraging the] AI-driven platform to make better business decisions, faster,” SparkBeyond stated in an October press release.

A service that can explore such a vast number of hypotheses per minute across internal and external data sources, reveal previously unrecognized drivers of business and scenario outcomes, and explain its findings in natural language to people who may not need to code at all is quite the breakthrough in the analytics space.

Notable companies using SparkBeyond Discovery include McKinsey, Baker McKenzie, Hitachi, PepsiCo, Santander, and others.

Life is increasingly split between virtual and in-person – analytics must follow

Hubilo, a platform that helps businesses of all sizes host virtual and hybrid events and gain access to real-time data and analytics, raised $23.5 million in its series A funding round earlier this year.

Investments in companies like Hubilo that integrate tools for virtual and in-person tasks, events, meetings, and activities will likely continue into 2022 as the world enters year two of a global pandemic. Digital conferences, meetups, and events can be scaled more easily and with fewer resources than their brick-and-mortar counterparts. The shift to hybrid and virtual platforms also generates a significant amount of data that in-person events might not otherwise produce, which can prove valuable to companies for tracking and correlating business objectives.

Hubilo promises its customers enhanced data and measurability capabilities. Event organizers using Hubilo’s platform can access engagement data on visitors, including the number of logins and new users versus active users. Additionally, event sponsors can determine whether a visitor is likely to purchase from them based on engagement with their virtual booth. Data includes the number of business cards received, profile views, file downloads, and more.

The platform can also track visitors’ activities, such as attending a booth or participating in a video demonstration, and then recommend similar activities. From a business perspective, a sponsor or sales personnel can use these features to access potential prospects through a feature Hubilo calls “potential leads.”

Its integration capabilities are also key for companies now operating in a hybrid or fully remote capacity. Hubilo features a one-click approach for common “go-to-market platforms including HubSpot, Salesforce, and Marketo, enabling companies to demonstrate ROI through event data integrated with their existing workflows,” its press release stated. Integrating analytics tools with CRM and sales platforms is a vital trend that will continue to evolve as the world considers not how to get things back in person, but rather whether it should do so, and what it can gain from hybrid approaches and tools instead.

Graph database gets a revamp

What do the Panama Papers researchers, NASA engineers, and Fortune 500 leaders have in common? They all heavily rely on graphs and databases.

Neo4j, a graph database company that claims to have popularized the term graph database and aims to be a leader in the graph database industry, has shown signs through its growth this year that graphs are becoming a foundational part of the technology stack.

Across industry sectors, graph databases serve a variety of use cases that are both operational and analytical. A key advantage they have over other databases is their capability to model highly interconnected domains intuitively and to rapidly generate data models and queries for them. In an increasingly interconnected world, that is proving to be of value for companies.

What was once an early-adopter game has snowballed into the mainstream, and it’s still growing. “Graph Relates Everything” is how Gartner put it when including graphs in its top 10 data and analytics technology trends for 2021. At this year’s Gartner Data & Analytics Summit 2021, graphs were, unsurprisingly, front and center.

Interest from tech and data decision-makers is continuously expanding as graph data takes on a role in master data management, tracking laundered money, connecting Facebook friends, and powering the search page ranker in a dominant search engine.

With the noted increase in the volume of data that companies are now storing and processing in an increasingly digital world, tools that provide flexibility for interpreting, modeling, and using data will be key, and their usage is sure to increase going forward.

According to Neo4j, that is precisely what it’s capable of providing to its users. “A graph database stores nodes and relationships instead of tables, or documents. Data is stored just like you might sketch ideas on a whiteboard. Your data is stored without restricting it to a pre-defined model, allowing a very flexible way of thinking about and using it,” the press release reads.
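As a rough illustration of what storing nodes and relationships looks like in practice, the sketch below builds a tiny in-memory graph in plain Python. It is not Neo4j's API or storage model: the `TinyGraph` class, its method names, and the sample people are all hypothetical, chosen only to show the whiteboard-style structure the quote describes.

```python
from collections import defaultdict


class TinyGraph:
    """Toy in-memory graph: nodes with free-form properties, plus
    typed, directed relationships. Hypothetical, not Neo4j's API."""

    def __init__(self):
        self.nodes = {}                    # node id -> property dict
        self.outgoing = defaultdict(list)  # node id -> [(rel_type, target id)]

    def add_node(self, node_id, **properties):
        # No fixed schema: each node carries whatever properties it is given.
        self.nodes[node_id] = properties

    def relate(self, source, rel_type, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        self.outgoing[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        return [t for r, t in self.outgoing[node_id] if r == rel_type]

    def friends_of_friends(self, node_id, rel_type="KNOWS"):
        # Two-hop traversal: follow relationships directly instead of
        # joining tables on keys.
        direct = set(self.neighbors(node_id, rel_type))
        second = set()
        for friend in direct:
            second.update(self.neighbors(friend, rel_type))
        return sorted(second - direct - {node_id})


graph = TinyGraph()
graph.add_node("alice", kind="Person")
graph.add_node("bob", kind="Person")
graph.add_node("carol", kind="Person", city="Lisbon")  # extra property is fine
graph.relate("alice", "KNOWS", "bob")
graph.relate("bob", "KNOWS", "carol")
graph.relate("bob", "KNOWS", "alice")

fof = graph.friends_of_friends("alice")
print(fof)
```

The point of the sketch is the shape of the data: a question such as "who do my contacts know?" becomes a walk along stored relationships, which is why highly interconnected domains tend to suit this model.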

So what’s ahead for 2022? The analytics landscape will become increasingly complex in its capabilities, while simultaneously becoming even more user-friendly for researchers, developers, data scientists, and analytics professionals alike.

