U:: [[Big Data]]

## Overview
- Use cases:
    - collecting
    - cleansing
    - moving
    - cataloging
- Permissions model augments IAM permissions
    - Allows fine-grained access via grants and revokes, like a regular database (column, row, and cell-level access)
    - Works for Athena, QuickSight, Redshift
- Data ingestion and management
    - Import from databases in AWS like MySQL, PostgreSQL, SQL Server, MariaDB, Oracle in RDS
        - bulk and incremental loads supported
    - Connect to any database via JDBC (via AWS Glue)
- Security management
    - Access controls for data
        - Database, table, column, row, and cell level
        - Column level is useful for PII
        - Applies to IAM users and roles, and to those federating through an external identity provider
        - Can be used within Redshift Spectrum, Athena, Glue ETL, EMR (Spark)
    - Audit logging
        - CloudTrail logging for access and compliance with policies
    - Tag-based access control for managing permissions to databases, tables, and columns
        - replaces policy definitions for thousands of resources with a few logical tags
    - Cross-account access
        - Share with an IAM principal in an external account, so that they can use their own IAM permissions to access the resources
- Data sharing
    - Data sharing with Redshift
    - Data exchange

## Permissions Management
- 2 levels
    - metadata permissions (databases and tables in the catalog)
    - storage permissions (S3)
- Administrator does
    - registration of the S3 location
    - grants
- Permissions
    - Get metadata
        - Coordinates the analytics engine, data catalog, and storage
    - Credential vending
        - Has to be trusted to enforce Lake Formation permissions
    - Storage API
        - enforces Lake Formation permissions for custom and third-party apps

## Querying
- Governed tables
    - Allow ACID transactions
        - Create, update, delete
    - Automatic compaction of partitions
    - Time travel
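
## Example: column-level grant

The column-level grants mentioned under Overview (useful for keeping PII columns out of reach) can be sketched with boto3's `lakeformation` client. This is a minimal sketch, not a definitive setup: the principal ARN, database, table, and column names are all hypothetical, and `build_column_grant` and `apply_grant` are helper names made up for this note. The builder only assembles the request, so it runs without AWS access; `apply_grant` needs boto3 and credentials for a data lake administrator.

```python
from typing import Any, Dict, List


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant."""
    if not columns:
        raise ValueError("at least one column is required for a column-level grant")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Only these columns become visible to the principal.
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_grant(request: Dict[str, Any]) -> None:
    """Send the grant to Lake Formation (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("lakeformation").grant_permissions(**request)


# Hypothetical principal, database, table, and column names (assumptions).
request = build_column_grant(
    "arn:aws:iam::111122223333:role/analyst",
    "sales_db",
    "customers",
    ["customer_id", "country"],  # PII columns such as email are left out
)
```

Calling `apply_grant(request)` would issue the grant. A revoke takes the same request shape via the client's `revoke_permissions` call.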