Wednesday, February 4, 2026

Unifying governance and metadata across Amazon SageMaker Unified Studio and Atlan


This post was cowritten with Satabrata Paul and Karan Singh Thakur from Atlan.

In this post, we show you how to unify governance and metadata across Amazon SageMaker Unified Studio and Atlan through a comprehensive bidirectional integration. You’ll learn how to deploy the necessary Amazon Web Services (AWS) infrastructure, configure secure connections, and set up automated synchronization to maintain consistent metadata across both platforms.

As organizations scale their data and AI programs, teams often work across distributed tools such as governance solutions for business users and analytics or machine learning (ML) environments for technical teams. Without tight integration between these systems, metadata becomes fragmented. A single asset can appear under different names, documentation can drift out of sync, and governance signals can become inconsistent across systems.

To address these challenges, Atlan and AWS have built a bidirectional integration between Atlan and Amazon SageMaker Unified Studio. Atlan is a modern data workspace that makes collaboration easier among diverse users such as business users, analysts, and engineers, increasing efficiency and agility in data projects. The integration creates a continuous connection between both environments so every team across the enterprise can work with a single, trusted, and synchronized view of metadata for their data and AI assets. By bridging the gap between diverse users collaborating in Atlan and technical teams working inside Amazon SageMaker Unified Studio for analytics and ML, it maintains consistency across both platforms without requiring teams to switch contexts or manually reconcile metadata differences.

Why unified metadata governance matters

Enterprises today operate in hybrid environments. Business users rely on Atlan as an active metadata solution to manage, govern, and collaborate on data assets across the modern data stack. Atlan helps teams find, understand, and trust their data so they can use it effectively to drive business outcomes.

Organizations also use Amazon SageMaker Catalog to simplify discovery, governance, and collaboration for both business and technical data across structured and unstructured sources. Teams can use the catalog to organize data products, capture context, and apply governance policies consistently within Amazon SageMaker Unified Studio.

This new integration synchronizes metadata between SageMaker Catalog and Atlan, maintaining consistency and keeping content current across both environments. With a unified view, every team across the enterprise can work confidently with a single, trusted representation of their data and AI assets.

Solution overview

The solution follows a phased rollout strategy to provide immediate value while progressively expanding toward comprehensive data and AI governance capabilities. The current phase focuses on establishing secure, scalable, and reliable metadata synchronization between Atlan and Amazon SageMaker Unified Studio.

The Phase 1 integration between Amazon SageMaker Catalog and Atlan enables both on-demand and scheduled bidirectional metadata synchronization across the two solutions. It uses the standard APIs of Amazon SageMaker Unified Studio and Atlan to create a scalable and configurable mechanism for metadata exchange. Key capabilities include:

  • Secure connection using IAM roles – The integration is established through a managed AWS Identity and Access Management (IAM) based handshake. A predefined AWS CloudFormation template automatically provisions the IAM role and policies required to enable a secure, least-privilege connection between Amazon SageMaker Catalog and the Atlan application.
  • On-demand and scheduled synchronization – The integration supports both manual and automated metadata synchronization. API-driven workflows manage the exchange of glossary terms, asset descriptions, and classifications in both directions, keeping metadata consistent across systems.

After you’ve implemented Phase 1, you can perform bidirectional synchronization of glossary terms and descriptions between Amazon SageMaker Unified Studio and Atlan. This keeps your terminology consistent across both platforms, and your teams can maintain a single source of truth for business definitions. The integration also preserves your glossary structures, including parent-child relationships, so your carefully organized taxonomy stays intact during the sync process. Additionally, glossary terms are automatically associated with related data assets, saving you the manual effort of linking terms to the appropriate datasets and reducing the risk of inconsistencies.
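To make the idea of preserving parent-child relationships concrete, the following is a minimal sketch of one way a sync job could order glossary terms so that every parent is created before its children. The `GlossaryTerm` shape and the `order_parents_first` helper are hypothetical names used for illustration only; they are not part of the Atlan or SageMaker Unified Studio APIs, and the integration itself may handle hierarchy differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GlossaryTerm:
    # Hypothetical shape for illustration; not the Atlan or SageMaker schema.
    term_id: str
    name: str
    parent_id: Optional[str] = None


def order_parents_first(terms: list[GlossaryTerm]) -> list[GlossaryTerm]:
    """Return terms ordered so that every parent precedes its children."""
    by_id = {t.term_id: t for t in terms}
    ordered: list[GlossaryTerm] = []
    visited: set[str] = set()

    def visit(term: GlossaryTerm, path: set[str]) -> None:
        if term.term_id in visited:
            return
        if term.term_id in path:
            # A term that is its own ancestor cannot be ordered.
            raise ValueError(f"cycle detected at term {term.term_id}")
        parent = by_id.get(term.parent_id) if term.parent_id else None
        if parent is not None:
            # Emit the parent (and its ancestors) before this term.
            visit(parent, path | {term.term_id})
        visited.add(term.term_id)
        ordered.append(term)

    for term in terms:
        visit(term, set())
    return ordered
```

Creating terms in this order means a child never references a parent that does not exist yet in the target glossary, which is one simple way to keep a taxonomy intact during a sync.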

Beyond glossary management, Phase 1 enables comprehensive ingestion of assets and metadata from Amazon SageMaker Unified Studio into Atlan. This includes your projects, both published and subscribed assets, domains and data products, glossaries and terms, metadata forms, and column descriptions. By bringing this information into Atlan, you create a unified view of your data landscape that makes it easier for data consumers to discover, understand, and trust the data they’re working with.

Prerequisites

To follow along with this integration setup, you must have the following resources already configured in your environment:

  • An Atlan tenant
  • A node group IAM role
  • An Amazon SageMaker Unified Studio domain
  • At least one Amazon SageMaker Unified Studio project with assets created and glossary terms defined
  • An Atlan API token. You can generate this by navigating to API access under Atlan’s Admin center.
  • An Atlan top-level glossary. You can create this glossary container in Atlan to ingest SageMaker Unified Studio glossaries and terms.

The next section offers a step-by-step walkthrough of the integration, from initial setup to full operation. It demonstrates how to establish the trust handshake between Amazon SageMaker Unified Studio and Atlan and how bidirectional synchronization functions in practice.

Setup on AWS

To begin the integration, you need Atlan’s Account Node Instance IAM role. This role allows the Atlan SageMaker Unified Studio application to securely assume the IAM role that you will create in your AWS account using an AWS CloudFormation template. The trust relationship between these two roles authorizes Atlan to publish metadata to Amazon SageMaker Catalog and to perform reverse synchronization from AWS back into Atlan.
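As an illustration of what such a trust relationship looks like, the following sketch builds a generic IAM trust policy document that allows a single principal to assume a role. The policy structure (`Version`, `Statement`, `Principal`, `sts:AssumeRole`) is standard IAM; the function name `build_trust_policy` is a hypothetical helper, and the template provided for this integration may include additional conditions or statements that are not shown here.

```python
import json


def build_trust_policy(atlan_node_role_arn: str) -> dict:
    """Build an IAM trust policy that lets one principal assume the role.

    Generic sketch only: the CloudFormation template for the integration
    is the source of truth for the actual trust policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The ARN of the Atlan Account Node Instance IAM role.
                "Principal": {"AWS": atlan_node_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def to_policy_json(policy: dict) -> str:
    """Serialize the policy the way it would appear in the IAM console."""
    return json.dumps(policy, indent=2)
```

Calling `to_policy_json(build_trust_policy(arn))` with the ARN you receive from your Atlan administrator produces a document you can compare against the trust policy on the role that the stack creates.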

The IAM policy follows the principle of least privilege, granting Atlan access only to the resources necessary for cataloging and governance. This approach maintains accurate metadata synchronization while preserving your existing cloud security and compliance controls.

Follow AWS best practices when configuring trust relationships. These cross-account access mechanisms require careful management and monitoring, particularly during security incidents. For comprehensive guidance on securing IAM roles and trust policies, refer to Security best practices in IAM and Require workloads to use temporary credentials with IAM roles to access AWS.

Contact your Atlan administrator to obtain the Amazon Resource Name (ARN) of the Atlan Account Node Instance IAM role. You will need this value when configuring the CloudFormation stack in AWS.

The next step is to create an AWS IAM role using the provided CloudFormation template. This role establishes the trust relationship between your Amazon SageMaker Unified Studio environment and your Atlan tenant. Follow these steps:

  1. Access the CloudFormation template. The CloudFormation template is currently available as a YAML file.
  2. On the AWS Management Console, navigate to CloudFormation and choose Create stack, then choose With new resources (standard), as shown in the following screenshot.

  3. Choose the provided CloudFormation template and choose Next.

  4. Enter a name for the stack and complete the required parameters, as shown in the following screenshot:
    1. AtlanNodeInstanceRoleArn – The ARN of the Atlan node instance role.
    2. SMUSDomainId – The unique identifier for the SageMaker Unified Studio domain.
    3. SMUSProjectsToSync – The IDs of the projects where synchronization between SageMaker Unified Studio and Atlan will be enabled. You can either add the project IDs here and update this stack each time a project is added, or add the created IAM role to each project as an owner.

  5. Select the acknowledgement checkbox and choose Next, as shown in the following screenshot.

  6. Choose Submit to start the stack deployment. When the process is complete, the stack status will update to CREATE_COMPLETE.
  7. Note the IAM role ARN. After the CloudFormation stack has been deployed and the IAM role has been created, copy the IAM role ARN from the CloudFormation output. You will need this value during the configuration process on the Atlan side to establish the secure connection between your Amazon SageMaker Unified Studio environment and your Atlan tenant.
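If you prefer to script the console steps above, the following sketch shows one way to do it with the AWS SDK for Python (Boto3). The three parameter keys come from the template described in this post. Everything else is an assumption made for illustration: the helper names, passing the project IDs as a comma-separated string, and the use of `CAPABILITY_NAMED_IAM` to mirror the acknowledgement checkbox. Check the template itself for the expected parameter format before relying on this.

```python
def build_stack_parameters(
    atlan_node_role_arn: str, domain_id: str, project_ids: list[str]
) -> list[dict]:
    """Build the Parameters list in the shape CloudFormation's CreateStack expects."""
    if not project_ids:
        raise ValueError("at least one project ID is required")
    return [
        {"ParameterKey": "AtlanNodeInstanceRoleArn", "ParameterValue": atlan_node_role_arn},
        {"ParameterKey": "SMUSDomainId", "ParameterValue": domain_id},
        # Assumption: the template accepts a comma-separated list of project IDs.
        {"ParameterKey": "SMUSProjectsToSync", "ParameterValue": ",".join(project_ids)},
    ]


def deploy_stack(cfn_client, stack_name: str, template_body: str, parameters: list[dict]) -> str:
    """Create the stack and block until it reaches CREATE_COMPLETE.

    cfn_client is expected to be a boto3 CloudFormation client, for example
    boto3.client("cloudformation").
    """
    response = cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        # Assumption: the template creates IAM resources, so IAM capabilities
        # must be acknowledged, as with the console checkbox.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]
```

After the stack is created, the IAM role ARN is available in the stack outputs, the same value that step 7 asks you to copy from the console.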

Setup on Atlan

Now that you’ve deployed the necessary AWS resources, you’ll configure Atlan to establish the connection with Amazon SageMaker Unified Studio. This involves setting up the API token, configuring the IAM role, and creating the glossary container that will receive your synchronized metadata. Follow these steps:

  1. Sign in to your Atlan tenant, as shown in the following screenshot.

  2. On the New dropdown menu, choose New workflow.

  3. On the Marketplace tab, search for and select the AWS SageMaker Unified Studio app, as shown in the following screenshot.

  4. Enter credential details. Use the IAM role or user created earlier by the CloudFormation template, enter an API token, and choose your AWS Region, as shown in the following screenshot.

  5. Enter connection details. In Connection name, enter a name. Under Connection Admins, choose the plus icon to add members (other users) to the connection as admins. Assigning admin permissions to the connection allows these users to:
    1. View and edit the assets in the connection.
    2. Edit connection preferences.
    3. Edit persona-based policies for the connection.

  6. Choose metadata filters and preflight checks, as shown in the following screenshot:
    • In the Select Glossary to enrich dropdown menu, choose the glossary container in Atlan to be enriched with glossaries and terms from SageMaker Unified Studio.
    • To check for the permissions required to run the workflow, select Quick test for necessary permissions before workflow run.
    • To run the workflow, choose Run. To schedule it to run later, choose Schedule & Run.

Synchronization of metadata

Now that you’ve configured the integration between Atlan and Amazon SageMaker Unified Studio, let’s explore how metadata flows bidirectionally between both platforms to maintain consistency and governance across your data landscape.

The Atlan SageMaker Unified Studio connector uses a bidirectional synchronization model that keeps business context and technical metadata consistent across both solutions. The approach delivers reliability, traceability, and governance-safe updates, regardless of where changes originate. The following diagram illustrates the solution architecture.

Sequential workflow for the SageMaker Unified Studio Atlan integration

The integration between SageMaker Unified Studio and Atlan follows a carefully orchestrated sequential workflow that enables seamless metadata synchronization across both platforms.

The process begins with connection setup through IAM, where authentication and authorization are configured to establish secure access between the customer’s AWS account and Atlan’s AWS environment. This foundational security layer allows subsequent data exchanges to occur within a trusted framework.

After the connection is established, the metadata sync workflow can be triggered either on a defined schedule or manually by the user, providing flexibility based on organizational needs. When triggered, the Atlan SageMaker Unified Studio app calls the SageMaker Unified Studio APIs to ingest assets and metadata from the source system.

The ingested assets then undergo processing and transformation within Atlan, where they’re converted into Atlan’s metadata model. This processing step is crucial because it makes the assets discoverable, searchable, and governable inside the Atlan platform, which means teams can use Atlan’s full governance capabilities.

A key capability of this integration is its real-time reverse sync for metadata updates. When a user modifies metadata for the assets within Atlan (such as adding tags or updating descriptions), Atlan’s real-time reverse sync pipelines immediately detect these changes and push the updates back to SageMaker Unified Studio. This keeps SageMaker Unified Studio reflecting the most up-to-date metadata entered by users in Atlan, reducing the risk of metadata drift between systems.

This bidirectional sync creates a continuous loop in which metadata flows from SageMaker Unified Studio to Atlan for ingestion and publication while flowing back from Atlan to SageMaker Unified Studio through real-time reverse sync. The result is a consistent, bidirectional metadata flow that keeps both platforms synchronized. Teams can work confidently knowing that their metadata governance efforts are reflected across their data.

The following diagram illustrates this complete workflow, showing how metadata moves through each stage of the integration, from initial IAM authentication through the continuous bidirectional sync loop that maintains metadata consistency across both platforms.

SageMaker Unified Studio to Atlan: Ingestion of metadata

The Atlan-SageMaker Unified Studio app periodically connects to SageMaker Unified Studio using secure API calls to ingest metadata. This metadata is transformed and mapped into Atlan’s metadata model, then published through the Atlan publish app as new or updated assets.

Each ingestion cycle is fully logged by Atlan’s audit service, which captures timestamps, correlation IDs, and the full change record. These logs support deduplication, troubleshooting, and replay in the event of partial failures.
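To show how an audit log with timestamps, correlation IDs, and full change records can support deduplication and replay, here is a small in-memory sketch. It is not Atlan’s audit service: the `AuditLog` class, its methods, and the fingerprinting scheme are all assumptions chosen to illustrate the general technique.

```python
import hashlib
import json
from datetime import datetime, timezone


def change_fingerprint(change: dict) -> str:
    """Stable hash of a change record, independent of key order."""
    canonical = json.dumps(change, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory stand-in for an audit service (illustrative only)."""

    def __init__(self) -> None:
        self.entries: list[dict] = []
        self._seen: set[str] = set()

    def record(self, correlation_id: str, change: dict) -> bool:
        """Log a change; return False if an identical change was already logged."""
        fingerprint = change_fingerprint(change)
        if fingerprint in self._seen:
            return False  # duplicate, skip it
        self._seen.add(fingerprint)
        self.entries.append(
            {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "correlation_id": correlation_id,
                "fingerprint": fingerprint,
                "change": change,
            }
        )
        return True

    def replay(self, correlation_id: str) -> list[dict]:
        """Return the logged changes for one ingestion cycle, in order."""
        return [e["change"] for e in self.entries if e["correlation_id"] == correlation_id]
```

Because each entry keeps the full change record alongside its correlation ID, a cycle that failed partway through can be replayed from the log, and the fingerprint check prevents the same change from being applied twice.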

Atlan to SageMaker Unified Studio: Synchronizing enriched business context

When users enrich assets within Atlan, for example by updating descriptions or attaching glossary terms, the integration detects these changes and selectively pushes them back to SageMaker Unified Studio.

The reverse sync control plane is a pipeline that automatically detects changes made to assets and then triggers SageMaker Unified Studio Update API calls in the background to keep everything synchronized.
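The following sketch illustrates the "detect changes, then push selectively" idea by comparing two snapshots of asset metadata and producing update requests only for the fields that changed. The field names in `SYNCED_FIELDS`, the snapshot shape, and the request format are hypothetical; the actual pipeline and the SageMaker Unified Studio update calls it makes are not described in enough detail here to reproduce.

```python
# Hypothetical field names for the metadata that is pushed back.
SYNCED_FIELDS = ("description", "glossary_terms")


def detect_changes(previous: dict, current: dict) -> dict:
    """Return only the synced fields whose values differ between snapshots."""
    return {
        field: current.get(field)
        for field in SYNCED_FIELDS
        if previous.get(field) != current.get(field)
    }


def build_update_requests(before: dict[str, dict], after: dict[str, dict]) -> list[dict]:
    """Build one update request per asset that actually changed."""
    requests = []
    for asset_id, current in sorted(after.items()):
        changes = detect_changes(before.get(asset_id, {}), current)
        if changes:
            # Unchanged assets produce no request, so nothing is pushed for them.
            requests.append({"asset_id": asset_id, "changes": changes})
    return requests
```

Sending only the changed fields keeps the number of update calls proportional to what users edited instead of to the size of the catalog.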

What’s next?

Phase 1 delivers core metadata synchronization and primary catalog selection for immediate consistency across your data governance platforms. Phase 2 will synchronize lineage and data quality, so teams see the same data flows and quality signals in both Atlan and SageMaker Catalog. This enables end-to-end visibility into how data moves through your pipelines and keeps quality metrics consistently tracked across both systems. Phase 3 will add integrated approval workflows to streamline how access is requested and granted across solutions, reducing friction for data consumers while maintaining robust governance controls. These upcoming phases build toward a fully connected governance experience, keeping metadata, lineage, quality, and access policies aligned across the modern data stack.

Cleanup

If you no longer need the SageMaker Unified Studio connector integration, complete the following steps to clean up your environment and avoid unintended resource usage:

  1. Delete the CloudFormation stack. Navigate to the AWS CloudFormation console, locate the stack deployed for this solution, and choose Delete. This action removes the AWS resources provisioned by the stack, including IAM roles, policies, and supporting components.
  2. Remove the connection in Atlan. Follow the steps in Delete a connection in Atlan’s documentation to delete the associated connection.

Cleaning up these components keeps your AWS and Atlan environments streamlined, secure, and cost-efficient.
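The stack deletion in step 1 can also be scripted. This is a minimal sketch using the Boto3 CloudFormation client’s `delete_stack` call and `stack_delete_complete` waiter; the helper name and the stack name you pass in are placeholders. Removing the connection in Atlan still has to be done separately as described in step 2.

```python
def delete_stack_and_wait(cfn_client, stack_name: str) -> None:
    """Delete a CloudFormation stack and block until deletion completes.

    cfn_client is expected to be a boto3 CloudFormation client, for example
    boto3.client("cloudformation").
    """
    cfn_client.delete_stack(StackName=stack_name)
    # The waiter polls the stack status until it reports DELETE_COMPLETE.
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```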

Conclusion

In this post, you learned how to establish a bidirectional integration between Atlan and Amazon SageMaker Unified Studio that unifies metadata governance across your data and AI environments. You walked through deploying the necessary AWS infrastructure using CloudFormation, configuring the secure IAM-based connection, and setting up bidirectional synchronization to keep glossary terms, descriptions, and governance context aligned across both platforms.

Organizations can use this integration to connect business and technical users within a single governance framework, creating a consistent, trusted view of data across the enterprise. With one secure configuration, teams can synchronize metadata between Atlan and Amazon SageMaker Unified Studio, establishing a reliable foundation for innovation, collaboration, and responsible AI at scale.


About the authors

Karan Singh Thakur

Karan is a Senior Product Manager at Atlan, leading the strategy and execution for deep hyperscaler integrations, specifically across AWS. Before Atlan, Karan spent over a decade building cloud-based, data-intensive environments, including serving as the founding PM for a fully managed lakehouse engine and leading enterprise analytics, governance, and Kubernetes-based workload systems.

Satabrata Paul

Satabrata is a Senior Software Engineer on Atlan’s Metadata Marketplace team, where he designs and scales backend systems and CI/CD workflows for high-quality metadata connector integrations. Focused on modern data environments, he helps teams streamline asset discovery, lineage, and cataloging across complex environments.

Divij Bhatia

Divij is a Software Development Engineer at Amazon Web Services (AWS). He’s passionate about building resilient and scalable cloud-based solutions that solve real-world problems for customers. His free time often takes him outdoors, traveling and capturing landscapes.

Leonardo Gomez

Leonardo is a Principal Analytics Specialist Solutions Architect at Amazon Web Services (AWS). He has over a decade of experience in data management, helping customers around the globe address their business and technical needs.
