Amazon Web Services (AWS) has revealed a bevy of data-focused services that the cloud giant considers part of a ‘modern data strategy’.
According to AWS vice president of database, analytics and machine learning Swami Sivasubramanian, the three elements that make up a modern data strategy are future-proofing data foundations, weaving connective tissue and democratising data across an organisation.
“All of them play a critical role in helping you do more with your data,” he said at re:Invent. “If I can leave you with one thing today, please remember, it's individuals who ultimately create these sparks, but it is the responsibility of leaders to empower them with a data-driven culture to help them get there.”
"Without a data strategy that is built for tomorrow, organisations won't be able to make decisions that are key to gaining a competitive edge," he added.
As such, Sivasubramanian touched on a variety of data-related services during his presentation, one of which was the general availability of Amazon DocumentDB Elastic Clusters, which can elastically scale document databases to handle virtually any number of writes and reads with petabytes of storage capacity.
A range of new features for existing AWS services was also revealed. One such feature was Amazon Athena for Apache Spark, which enables users to run Apache Spark workloads, use Jupyter Notebook as an interface to perform data processing on Athena, and interact with Spark applications through Athena application programming interfaces (APIs).
Entering preview in the US is geospatial machine learning (ML) support for Amazon SageMaker, with the cloud giant stating that users can access readily available geospatial data sources, process and enrich large-scale geospatial datasets with purpose-built operations, and accelerate model building by selecting pretrained ML models.
Users can then analyse and explore the predictions on an interactive map, and share and collaborate on the results.
Also entering preview across the Northern Virginia, Ohio, Oregon, Tokyo, Ireland and Stockholm regions is Amazon Redshift’s ability to support running data warehouses in multiple AWS Availability Zones (AZ) simultaneously.
Redshift Multi-AZ deployments can be accessed as a single data warehouse with one endpoint, with the cloud giant claiming this allows users to recover in case of AZ failures without any user intervention.
Centralised access controls for Redshift data sharing were also brought to these six regions in preview, using AWS Lake Formation to centrally manage permissions on live data shared across Redshift data warehouses in an organisation.
Further, Redshift now supports auto-copy from Amazon S3 in preview in the same six regions, with users being able to set up continuous file ingestion rules to track Amazon S3 paths and automatically load new files without the need for additional tools or custom solutions.
Amazon GuardDuty RDS Protection for Amazon Aurora was also announced for preview in the Northern Virginia, Ohio, Oregon, Tokyo and Ireland regions at no additional cost. It profiles and monitors access activity to existing and new databases in a user’s account, using tailored ML models to detect suspicious logins to Aurora databases.
Additionally, the same five regions received access to the AWS Glue Data Quality preview, which the cloud giant describes as a new capability that can automatically measure and monitor data lake and data pipeline quality.
ML governance tools for Amazon SageMaker also received a mostly global launch, which included Role Manager to define minimum permissions for users and Model Cards to create a single source of truth for model information by centralising and standardising documentation throughout the model lifecycle.
The third tool launched was Model Dashboard, which provides unified monitoring across all models, flagging deviations from expected behaviour through automated alerts and helping with troubleshooting.
AWS also revealed 22 new data connectors for Amazon AppFlow, bringing the total number to over 50, and raised the number of third-party applications available as data sources for SageMaker Data Wrangler to over 40.
Sivasubramanian also revealed that AWS Machine Learning University is offering a free educator enablement program for higher education, which prioritises US community colleges, Minority Serving Institutions (MSI) and Historically Black Colleges and Universities (HBCU).
The program allows educators to launch courses, certificates and degrees in data management, ML and artificial intelligence (AI). It is currently unknown whether it will be expanded globally.
AWS CEO Adam Selipsky kept challenges in the current business climate top of mind during his own keynote, in which he launched various new and updated services.
Sasha Karen attended AWS re:Invent 2022 in Las Vegas as a guest of AWS.