- Explain your current project.
- How do you create triggers in AWS Lambda?
- Explain the AWS Glue architecture.
- What is the Schema Registry in AWS Glue?
- How is metadata stored in the Glue Catalog?
- Assume you are getting data from two different sources, such as DynamoDB and an RDBMS. How is the schema managed in the Glue Catalog?
- Design a solution using AWS services so that, when a Glue job completes, stakeholders are notified with metrics such as the number of files processed, the number of successes and failures, and the size of each file.
- How do you determine the size of a file in AWS S3 using AWS Glue, and delete the files whose size is greater than 100 KB?
- Assume you are writing data to a sink and, while writing, a few corrupted records go missing. How do you store the corrupted records in a separate table using Spark?
- What is the difference between a Spark RDD and a DataFrame?
- What is repartition in Spark?
- What are the Spark deploy modes?
- What is the role of the map function in Spark? Explain with an example.
- Suppose you are reading a huge CSV file into a Spark DataFrame and the data is shuffled across partitions. How do you get the data distributed evenly across all partitions?
- How do you perform a union of two DataFrames when their schemas change dynamically?
- How do you flatten JSON in Snowflake?
- Write a Python program to capitalize the first letter of each word in a string.
- Write a Python program to get the 3rd least value in a list.
- What is row number in SQL?
- How is data stored in a data warehouse? Is it normalized or denormalized?
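
For the two Python questions above, a minimal sketch of one possible answer follows. The function names `capitalize_words` and `third_smallest` are hypothetical, chosen only for illustration. The sketch assumes that "3rd least value" means the third smallest distinct value; an interviewer may instead want duplicates counted, in which case the `set` call would be dropped.

```python
def capitalize_words(text: str) -> str:
    """Upper-case the first character of every space-separated word.

    Splitting on a single space (rather than calling split() with no
    argument) keeps runs of multiple spaces intact. The rest of each
    word is left unchanged.
    """
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def third_smallest(values: list) -> int:
    """Return the third smallest distinct value in the list.

    Assumption: duplicates are counted once. Raises ValueError when the
    list holds fewer than three distinct values.
    """
    distinct = sorted(set(values))
    if len(distinct) < 3:
        raise ValueError("need at least three distinct values")
    return distinct[2]
```

For example, `capitalize_words("hello world from glue")` gives `"Hello World From Glue"`, and `third_smallest([7, 2, 9, 2, 4, 1])` gives `4`, because the distinct values in order are 1, 2, 4, 7, 9.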