Glue

aws/analytics aws/serverless aws/service

💡 Definition

AWS Glue is a serverless data integration service for discovering, preparing, and combining data for analytics, machine learning, and application development. It is primarily an ETL (Extract, Transform, Load) service.

🔑 Key Concepts

⚙️ How it Works

  1. Crawler: Scans your data (e.g., in S3), infers its schema, and creates table definitions in the Data Catalog.
  2. Job: Defines the transformations, either as a script you write (Python/Scala) or as a flow built in the visual editor.
  3. Trigger: Starts the job on a schedule or in response to an event; the job then writes the transformed data to a destination (e.g., Redshift).
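
The three steps above can be sketched as the request parameters you would pass to the boto3 Glue client (`create_crawler`, `create_job`, `create_trigger`). This is a minimal sketch, not a complete deployment: the role ARN, resource names, S3 paths, and cron schedule are all hypothetical placeholders, and the helper functions are illustrative names that are not part of any AWS library. The `deploy` function needs boto3 and valid AWS credentials, so it is defined but not called here.

```python
# Sketch of the crawler -> job -> trigger flow as boto3 request parameters.
# Every name, ARN, and S3 path below is a hypothetical placeholder.
ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleGlueRole"


def crawler_params(name, database, s3_path):
    """Step 1: a crawler that scans an S3 path into a Data Catalog database."""
    return {
        "Name": name,
        "Role": ROLE_ARN,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def job_params(name, script_location):
    """Step 2: a Spark ETL job whose Python script is stored in S3."""
    return {
        "Name": name,
        "Role": ROLE_ARN,
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
    }


def trigger_params(name, job_name, cron):
    """Step 3: a scheduled trigger that starts the job."""
    return {
        "Name": name,
        "Type": "SCHEDULED",
        "Schedule": cron,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def deploy():
    """Create the three resources. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    glue = boto3.client("glue")
    glue.create_crawler(
        **crawler_params("example-crawler", "example_db", "s3://example-bucket/raw/")
    )
    glue.create_job(
        **job_params("example-job", "s3://example-bucket/scripts/etl.py")
    )
    glue.create_trigger(
        **trigger_params("example-nightly", "example-job", "cron(0 2 * * ? *)")
    )


if __name__ == "__main__":
    # Print the parameters only; call deploy() yourself to create resources.
    print(crawler_params("example-crawler", "example_db", "s3://example-bucket/raw/"))
    print(job_params("example-job", "s3://example-bucket/scripts/etl.py"))
    print(trigger_params("example-nightly", "example-job", "cron(0 2 * * ? *)"))
```

Keeping the parameters in plain functions makes each step easy to inspect before anything is created in an AWS account. The transformation logic itself lives in the script at `ScriptLocation`, which is not shown here.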

🎯 Use Cases

💰 Pricing Model

📝 Exam Tips (CLF-C02)


See Also:

  * Athena
  * Redshift
  * EMR