
Free Snowflake DEA-C01 Exam 2023 Practice Materials Collection
NEW QUESTION # 34
Select the correct statements regarding the use of federated authentication/SSO.
- A. Snowflake supports using SSO with organizations, and you can use the corresponding URL in the SAML2 security integration.
- B. Snowflake supports using MFA in conjunction with SSO to provide additional levels of security.
- C. Snowflake supports SSO with Private Connectivity to the Snowflake Service for Snowflake accounts on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform.
- D. Snowflake supports multiple audience values (i.e. Audience or Audience Restriction Fields) in the SAML 2.0 assertion from the identity provider to Snowflake.
Answer: A,B,C,D
NEW QUESTION # 35
A Data Engineer wants staged files to be deleted periodically once Snowpipe has successfully loaded their data into tables. Which option or command is best suited for this?
- A. The PURGE option can be set to TRUE in the COPY INTO command embedded in the pipe object's definition.
- B. To remove staged files that are no longer needed, the DELETE command can be executed periodically to delete the files.
- C. The REMOVE_STAGE_FILES option can be set to TRUE in the COPY INTO command embedded in the pipe object's definition.
- D. To remove staged files that are no longer needed, the REMOVE command can be executed periodically to delete the files.
Answer: D
Explanation:
Deleting Staged Files After Snowpipe Loads the Data
Pipe objects do not support the PURGE copy option. Snowpipe cannot delete staged files automatically when the data is successfully loaded into tables.
To remove staged files that you no longer need, periodically execute the REMOVE command to delete the files.
Alternatively, configure any lifecycle management features provided by your cloud storage service provider.
NEW QUESTION # 36
To troubleshoot a data load failure in one of his COPY statements, a Data Engineer executed a COPY statement with the VALIDATION_MODE copy option set to RETURN_ALL_ERRORS, referencing the set of files he had attempted to load. Which of the functions below can facilitate analysis of the problematic records in the results produced? [Select 2]
- A. LOAD_ERROR
- B. LAST_QUERY_ID
- C. RESULT_SCAN
- D. Rejected_record
Answer: B,C
Explanation:
LAST_QUERY_ID() Function
Returns the ID of a specified query in the current session. If no query is specified, the most recently executed query is returned.
RESULT_SCAN() Function
Returns the result set of a previous command (within 24 hours of when you executed the query) as if the result was a table.
The following example validates a set of files (SFfile.csv.gz) that contain errors. To facilitate analysis of the errors, a COPY INTO <location> statement then unloads the problematic records into a text file so they can be analyzed and fixed in the original data files. The statement queries the RESULT_SCAN table function.
copy into Snowtable
from @SFstage/SFfile.csv.gz
validation_mode=return_all_errors;
set qid=last_query_id();
copy into @SFstage/errors/load_errors.txt from (select rejected_record from table(result_scan($qid)));
Note: The other options are not valid functions.
NEW QUESTION # 37
Snowflake supports using key pair authentication for enhanced authentication security as an alternative to basic authentication (i.e. username and password). Which of the following Snowflake clients support it?
[Select All that Apply]
- A. Node.js
- B. SnowCD
- C. Snowflake Connector for Spark
- D. Go Driver
- E. SnowSQL
Answer: A,C,D,E
NEW QUESTION # 38
A Data Engineer performs the steps below in sequence while working with stream s1, created on table t1.
Step 1: Begin transaction.
Step 2: Query stream s1 on table t1.
Step 3: Update rows in table t1.
Step 4: Query stream s1.
Step 5: Commit transaction.
Step 6: Begin transaction.
Step 7: Query stream s1.
Mark the Incorrect Operational statements:
- A. For Step 4, the stream returns CDC records that include the rows updated in Step 3, because streams work in read committed mode, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed.
- B. For Step 5, if the stream was consumed in DML statements within the transaction, the stream position advances to the transaction start time.
- C. For Step 7, the results include table changes committed by Transaction 1.
- D. If Transaction 2 had begun before Transaction 1 was committed, queries to the stream would have returned a snapshot of the stream from the position of the stream to the beginning time of Transaction 2 and would not see any changes committed by Transaction 1.
- E. For Step 2, the stream returns the change data capture records between the current position and the Transaction 1 start time. If the stream is used in a DML statement, the stream is then locked to avoid changes by concurrent transactions.
Answer: A
Explanation:
Streams support repeatable read isolation. In repeatable read mode, multiple SQL statements within a transaction see the same set of records in a stream. This differs from the read committed mode supported for tables, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed.
The delta records returned by a stream in a transaction cover the range from the current position of the stream until the transaction start time. The stream position advances to the transaction start time if the transaction commits; otherwise, it stays at the same position.
Within Transaction 1, all queries to stream s1 see the same set of records. DML changes to table t1 are recorded to the stream only when the transaction is committed.
In Transaction 2, queries to the stream see the changes recorded to the table in Transaction 1. Note that if Transaction 2 had begun before Transaction 1 was committed, queries to the stream would have returned a snapshot of the stream from the position of the stream to the beginning time of Transaction 2 and would not see any changes committed by Transaction 1.
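The repeatable read behaviour described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: the `Stream` and `Transaction` classes, the integer logical clock, and the change labels are all hypothetical and are not Snowflake APIs. It mirrors the rule that a stream query inside a transaction sees changes committed after the stream offset and up to the transaction start time.

```python
class Stream:
    """Toy stream: stores only an offset into the table's commit history."""

    def __init__(self):
        self.offset = 0


class Transaction:
    """Toy transaction over a shared commit history of (commit_time, change)."""

    def __init__(self, history, stream, start_time):
        self.history = history
        self.stream = stream
        self.start_time = start_time
        self.pending = []       # table changes not yet committed
        self.consumed = False   # True once the stream is used in a DML statement

    def query_stream(self):
        # Repeatable read: the window is fixed at (offset, start_time].
        return [change for commit_time, change in self.history
                if self.stream.offset < commit_time <= self.start_time]

    def consume_stream(self):
        # Stands in for using the stream as the source of a DML statement.
        self.consumed = True
        return self.query_stream()

    def update_table(self, change):
        self.pending.append(change)

    def commit(self, commit_time):
        # Table changes become visible to streams only at commit.
        self.history.extend((commit_time, c) for c in self.pending)
        if self.consumed:
            self.stream.offset = self.start_time


history = [(1, "insert row 1")]                 # committed before Transaction 1
s1 = Stream()

txn1 = Transaction(history, s1, start_time=2)   # Step 1
step2 = txn1.query_stream()                     # Step 2
txn1.update_table("update row 1")               # Step 3
step4 = txn1.query_stream()                     # Step 4: same records as Step 2
txn1.commit(commit_time=3)                      # Step 5

txn2 = Transaction(history, s1, start_time=4)   # Step 6
step7 = txn2.query_stream()                     # Step 7: now includes the update
```

In this model Steps 2 and 4 return identical records, and the update from Step 3 first appears in Step 7. Because Transaction 1 only queried the stream and never consumed it in DML, the offset does not advance.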
NEW QUESTION # 39
Charles, a Lead Data Engineer with the ACCOUNTADMIN role, wants to configure Time Travel for one of the objects in a schema. He set the MIN_DATA_RETENTION_TIME_IN_DAYS parameter to 79 at the account level, but found that DATA_RETENTION_TIME_IN_DAYS is already set to 81 at the account level. What would be the effective minimum data retention period for an object?
- A. 0
- B. There is no such MIN_DATA_RETENTION_TIME_IN_DAYS parameter
- C. 1
- D. 2
Answer: D
Explanation:
A user with the ACCOUNTADMIN role can also set the MIN_DATA_RETENTION_TIME_IN_DAYS at the account level. This parameter setting enforces a minimum data retention period for databases, schemas, and tables. Setting MIN_DATA_RETENTION_TIME_IN_DAYS does not alter or replace the DATA_RETENTION_TIME_IN_DAYS parameter value. It may, however, change the effective data retention period for objects. When MIN_DATA_RETENTION_TIME_IN_DAYS is set at the account level, the data retention period for an object is determined by MAX(DATA_RETENTION_TIME_IN_DAYS, MIN_DATA_RETENTION_TIME_IN_DAYS).
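The MAX rule can be checked with a one-line Python sketch. The function name `effective_retention_days` is hypothetical; the input values 81 and 79 are the ones given in the question.

```python
def effective_retention_days(data_retention_days: int,
                             min_data_retention_days: int) -> int:
    """Effective retention when the account-level minimum is set."""
    return max(data_retention_days, min_data_retention_days)


# Values from the question: DATA_RETENTION_TIME_IN_DAYS = 81, minimum = 79.
question_case = effective_retention_days(81, 79)

# The minimum only takes over when the object's own setting is lower.
low_setting_case = effective_retention_days(1, 79)
```

With the question's values the larger setting, 81 days, applies; the minimum becomes the effective period only when an object's own retention is below it.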
NEW QUESTION # 40
Mark the Correct Statements:
Statement 1. Snowflake's zero-copy cloning feature provides a convenient way to quickly take a "snapshot" of any table, schema, or database.
Statement 2. Data Engineer can use zero-copy cloning feature for creating instant backups that do not incur any additional costs (until changes are made to the cloned object).
- A. Statement 1 & 2 are correct.
- B. Statement 2
- C. Statement 1
- D. Both are False.
Answer: A
Explanation:
Snowflake's zero-copy cloning feature provides a convenient way to quickly take a "snapshot" of any table, schema, or database and create a derived copy of that object which initially shares the underlying storage. This can be extremely useful for creating instant backups that do not incur any additional costs (until changes are made to the cloned object).
For example, when a clone is created of a table, the clone utilizes no data storage because it shares all the existing micro-partitions of the original table at the time it was cloned; however, rows can then be added, deleted, or updated in the clone independently from the original table. Each change to the clone results in new micro-partitions that are owned exclusively by the clone and are protected through CDP.
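The storage-sharing idea can be sketched as a toy Python model in which a table is nothing more than a set of micro-partition IDs. The `Table` class, the `exclusive_partitions` helper, and the `mp-N` identifiers are hypothetical illustrations, not how Snowflake represents metadata internally.

```python
class Table:
    """Toy table: just a set of micro-partition IDs."""

    def __init__(self, partitions):
        self.partitions = set(partitions)

    def clone(self):
        # Zero-copy: the clone references the same partition IDs; no data is copied.
        return Table(self.partitions)

    def rewrite_partition(self, old_id, new_id):
        # A change replaces the affected partition with a new one in this table only.
        self.partitions.discard(old_id)
        self.partitions.add(new_id)


def exclusive_partitions(table, other):
    """Partitions referenced by `table` but not by `other`."""
    return table.partitions - other.partitions


original = Table({"mp-1", "mp-2", "mp-3"})
backup = original.clone()
owned_at_clone_time = exclusive_partitions(backup, original)

# Updating rows in the clone writes a new micro-partition owned by the clone.
backup.rewrite_partition("mp-2", "mp-4")
owned_after_update = exclusive_partitions(backup, original)
```

Right after cloning, the clone owns no partitions of its own, which is why it adds no storage cost. Only the rewritten partition becomes exclusive to the clone, and the original table is unaffected.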
NEW QUESTION # 41
As a Data Engineer, you need to query the most recent data from a large dataset that resides in external cloud storage. How would you design your data pipeline with the fastest time to delivery in mind?
- A. External tables with materialized views can be created in Snowflake.
- B. Snowpipe can be leveraged with streams to load data in micro-batch fashion, with CDC streams that capture the most recent data only.
- C. Directly query external tables on top of existing data stored in external cloud storage for analysis without first loading it into Snowflake.
- D. Unload data into Snowflake internal data storage using the PUT command.
- E. Data pipelines would be created to first load data into internal stages and then into a permanent table with SCD Type 2 transformation.
Answer: A
Explanation:
In a typical table, the data is stored in the database; however, in an external table, the data is stored in files in an external stage. External tables store file-level metadata about the data files, such as the filename, a version identifier and related properties. This enables querying data stored in files in an external stage as if it were inside a database. External tables can access data stored in any format supported by COPY INTO <table> statements.
External tables are read-only, therefore no DML operations can be performed on them; however, external tables can be used for query and join operations. Views can be created against external tables.
Querying data stored external to the database is likely to be slower than querying native database tables; however, materialized views based on external tables can improve query performance.
Creating external tables enables users to query existing data stored in external cloud storage for analysis without first loading it into Snowflake. The source of truth for the data remains in the external cloud storage.
Data sets materialized in Snowflake via materialized views are read-only.
This solution is especially beneficial to accounts that have a large amount of data stored in external cloud storage and only want to query a portion of the data; for example, the most recent data. Users can create materialized views on subsets of this data for improved query performance.
NEW QUESTION # 42
If the data retention period for a table is less than 90 days, and a stream has not been consumed, Snowflake temporarily extends this period to prevent it from going stale?
- A. FALSE
- B. TRUE
Answer: A
Explanation:
If the data retention period for a table is less than 14 days, and a stream has not been consumed, Snowflake temporarily extends this period to prevent it from going stale. The period is extended to the stream's offset, up to a maximum of 14 days by default, regardless of the Snowflake edition for your account. The maximum number of days for which Snowflake can extend the data retention period is determined by the MAX_DATA_EXTENSION_TIME_IN_DAYS parameter value. When the stream is consumed, the extended data retention period is reduced to the default period for the table.
NEW QUESTION # 43
Mark the Correct Statements:
Statement 1. Enable failover for a primary database to one or more accounts in your organization using an ALTER DATABASE ... ENABLE FAILOVER TO ACCOUNTS statement.
Statement 2. A Data Engineer can enable failover for a primary database either before or after a replica of the primary database has been created in a specified account.
- A. Both are Correct.
- B. Both are False.
- C. Statement 2
- D. Statement 1
Answer: A
NEW QUESTION # 44
Mark, a Data Engineer, wants to implement streams on local views and use change tracking metadata for one of his data loading use cases. Select the incorrect point in Mark's understanding of streams on views.
- A. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table.
- B. Enabling change tracking adds a pair of hidden columns to the table and begins storing change tracking metadata. The values in these hidden CDC data columns provide the input for the stream metadata columns. The columns consume a small amount of storage.
- C. As an alternative to streams, Snowflake supports querying change tracking metadata for views using the CHANGES clause for SELECT statements.
- D. Streams on views with GROUP BY and LIMIT clauses are supported by Snowflake.
- E. For streams on views, change tracking must be enabled explicitly for the view and underlying tables to add the hidden columns to these tables.
Answer: D
Explanation:
A stream object records data manipulation language (DML) changes made to tables, including inserts, updates, and deletes, as well as metadata about each change, so that actions can be taken using the changed data. This process is referred to as change data capture (CDC). An individual table stream tracks the changes made to rows in a source table. A table stream (also referred to as simply a "stream") makes a "change table" available of what changed, at the row level, between two transactional points of time in a table. This allows querying and consuming a sequence of change records in a transactional fashion.
Streams can be created to query change data on the following objects:
Standard tables, including shared tables.
Views, including secure views
Directory tables
External tables
When created, a stream logically takes an initial snapshot of every row in the source object (e.g. table, external table, or the underlying tables for a view) by initializing a point in time (called an offset) as the current transactional version of the object. The change tracking system utilized by the stream then records information about the DML changes after this snapshot was taken. Change records provide the state of a row before and after the change. Change information mirrors the column structure of the tracked source object and includes additional metadata columns that describe each change event.
Note that a stream itself does not contain any table data. A stream only stores an offset for the source object and returns CDC records by leveraging the versioning history for the source object. When the first stream for a table is created, a pair of hidden columns are added to the source table and begin storing change tracking metadata. These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table. Note that for streams on views, change tracking must be enabled explicitly for the view and underlying tables to add the hidden columns to these tables.
Streams on views support both local views and views shared using Snowflake Secure Data Sharing, including secure views. Currently, streams cannot track changes in materialized views.
Views with the following operations are not yet supported:
GROUP BY clauses
QUALIFY clauses
Subqueries not in the FROM clause
Correlated subqueries
LIMIT clauses
Change Tracking:
Change tracking must be enabled in the underlying tables.
Prior to creating a stream on a view, you must enable change tracking on the underlying tables for the view.
Set the CHANGE_TRACKING parameter when creating a view (using CREATE VIEW) or later (using ALTER VIEW).
As an alternative to streams, Snowflake supports querying change tracking metadata for tables or views using the CHANGES clause for SELECT statements. The CHANGES clause enables querying change tracking metadata between two points in time without having to create a stream with an explicit transactional offset.
NEW QUESTION # 45
Jackie, a Data Engineer, advised her data team members about one of the roles, highlighting the following points:
1. Avoid Using the <?> Role for Automated Scripts
2. Avoid Using the <?> Role to Create Objects
Which system-defined or custom role is she referring to?
- A. ACCOUNTADMIN
- B. SECURITYADMIN
- C. SYSADMIN
- D. USERADMIN
- E. CUSTOM Role
Answer: A
NEW QUESTION # 46
Regular views do not cache data, and therefore cannot improve performance by caching?
- A. TRUE
- B. FALSE
Answer: A
Explanation:
Regular views do not cache data, and therefore cannot improve performance by caching.
NEW QUESTION # 47
By default, a newly-created Custom role is not assigned to any user, nor granted to any other role?
- A. TRUE
- B. FALSE
Answer: A
NEW QUESTION # 48
A Data Engineer is using an existing pipe that automates data loads using event notifications. He later found that he needs to modify the pipe's properties and, as a best practice, decided to recreate the pipe. He followed the steps below.
1. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is RUNNING.
2. Recreate the pipe (using CREATE OR REPLACE PIPE).
3. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is RUNNING.
Which recommended steps are missing when recreating pipes for automated data loads?
- A. Pause the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = true) before and after recreation, then resume it (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = false).
- B. Terminate the existing pipe (using ALTER PIPE ... SET PIPE_EXECUTION_TERMINATE = true) before recreation.
- C. Force the pipe to resume (using SYSTEM$PIPE_FORCE_RESUME).
- D. CREATE OR REPLACE PIPE command will recreate the PIPE successfully.
Answer: A
Explanation:
Recreating a pipe (using a CREATE OR REPLACE PIPE statement) is necessary to modify most pipe properties.
Recreating Pipes for Automated Data Loads
When recreating a pipe that automates data loads using event notifications, it's recommended that a Data Engineer complete the following steps:
1. Pause the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = true).
2. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is PAUSED.
3. Recreate the pipe (using CREATE OR REPLACE PIPE).
4. Pause the pipe again.
5. Review the configuration steps for your cloud messaging service to ensure the settings are still accurate.
6. Resume the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = false).
7. Query the SYSTEM$PIPE_STATUS function again and verify that the pipe execution state is RUNNING.
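The ordering of the recommended statements can be sketched as a small Python helper that only builds the SQL text and does not connect to Snowflake. The helper name `recreate_pipe_statements`, the pipe name `mypipe`, and the elided pipe definition are hypothetical; the sequence includes the resume statement named in option A, and the review of the cloud messaging configuration is a manual step with no statement.

```python
from typing import List


def recreate_pipe_statements(pipe_name: str, create_pipe_ddl: str) -> List[str]:
    """Return the SQL statements for recreating a pipe, in the recommended order."""
    pause = f"ALTER PIPE {pipe_name} SET PIPE_EXECUTION_PAUSED = true"
    resume = f"ALTER PIPE {pipe_name} SET PIPE_EXECUTION_PAUSED = false"
    status = f"SELECT SYSTEM$PIPE_STATUS('{pipe_name}')"
    return [
        pause,            # pause the existing pipe
        status,           # verify the execution state is PAUSED
        create_pipe_ddl,  # recreate the pipe
        pause,            # pause the recreated pipe
        # (manual) review the cloud messaging service configuration here
        resume,           # resume the pipe
        status,           # verify the execution state is RUNNING
    ]


# The pipe definition is left elided; supply the real CREATE OR REPLACE PIPE text.
steps = recreate_pipe_statements("mypipe", "CREATE OR REPLACE PIPE mypipe ...")
```

Building the list first makes the order easy to review before any statement is executed through a client of your choice.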
NEW QUESTION # 49
Which of the following rules are invalid when a stored procedure contains a transaction?
- A. All Rules are Applicable.
- B. You cannot start a transaction before calling the stored procedure, then complete the transaction inside the stored procedure.
- C. If a transaction is started inside a stored procedure and is still active when the stored procedure finishes, then an error occurs, and the transaction is rolled back.
- D. A transaction inside a stored procedure can include a call to another stored procedure that contains a transaction.
- E. You cannot start a transaction inside the stored procedure, then complete the transaction after returning from the procedure.
- F. A transaction can be inside a stored procedure, or a stored procedure can be inside a transaction.
Answer: A
NEW QUESTION # 50
Streams cannot be created to query change data on which of the following objects? [Select All that Apply]
- A. Views, including secure views
- B. Query Log Tables
- C. External tables
- D. Standard tables, including shared tables.
- E. Directory tables
Answer: B
Explanation:
Streams support all the listed objects except Query Log tables.
NEW QUESTION # 51
Snowflake computes and adds partitions based on the defined partition column expressions when an external table metadata is refreshed.
What are the Correct Statements to configure Partition metadata refresh in case of External Tables?
- A. The object owner can configure the metadata to refresh automatically when new or updated data files are available in the external stage.
- B. There is nothing like adding partitions on External tables.
- C. Metadata refresh is not required as it is managed implicitly by Snowflake.
- D. By default, the metadata is refreshed automatically when the object is created.
- E. Partitions of external tables are managed by the external stage's cloud provider.
Answer: A,D
Explanation:
Snowflake strongly recommends partitioning your external tables, which requires that your underlying data is organized using logical paths that include date, time, country, or similar dimensions in the path.
Partitioning divides your external table data into multiple parts using partition columns.
An external table definition can include multiple partition columns, which impose a multi-dimensional structure on the external data.
Partitions are stored in the external table metadata.
Benefits of partitioning include improved query performance.
Because the external data is partitioned into separate slices/parts, query response time is faster when processing a small part of the data instead of scanning the entire data set.
Based on your individual use cases, you can either:
Add new partitions automatically by refreshing an external table that defines an expression for each partition column.
Add new partitions manually.
Partition columns are defined when an external table is created, using the CREATE EXTERNAL TABLE ... PARTITION BY syntax.
After an external table is created, the method by which partitions are added cannot be changed.
Partitions Added Automatically
An external table creator defines partition columns in a new external table as expressions that parse the path and/or filename information stored in the METADATA$FILENAME pseudocolumn.
A partition consists of all data files that match the path and/or filename in the expression for the partition column.
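As a rough illustration of how a partition expression derives a value from the file path, the Python sketch below parses a date out of each path and groups the files by it. The path layout `logs/YYYY/MM/DD/...`, the file names, and both helper functions are assumptions made for this example; in Snowflake the equivalent parsing would be written as a SQL expression over METADATA$FILENAME.

```python
from collections import defaultdict
from datetime import date


def partition_date(filename: str) -> date:
    """Parse the partition value from a path like 'logs/2023/09/15/part_0.csv'."""
    _, year, month, day = filename.split("/")[:4]
    return date(int(year), int(month), int(day))


def group_by_partition(filenames):
    """Map each partition value to the data files whose path matches it."""
    partitions = defaultdict(list)
    for name in filenames:
        partitions[partition_date(name)].append(name)
    return dict(partitions)


files = [
    "logs/2023/09/15/part_0.csv",
    "logs/2023/09/15/part_1.csv",
    "logs/2023/09/16/part_0.csv",
]
partitions = group_by_partition(files)
```

Each distinct date becomes one partition holding every file under that path prefix, so a query filtered on the partition column only needs to touch the matching files.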
The CREATE EXTERNAL TABLE syntax for adding partitions automatically based on expressions is as follows:
CREATE EXTERNAL TABLE <table_name>
  ( <part_col_name> <col_type> AS <part_expr> )
  [ , ... ]
  [ PARTITION BY ( <part_col_name> [, <part_col_name> ... ] ) ]
Snowflake computes and adds partitions based on the defined partition column expressions when an external table metadata is refreshed.
By default, the metadata is refreshed automatically when the object is created.
In addition, the object owner can configure the metadata to refresh automatically when new or updated data files are available in the external stage.
The owner can alternatively refresh the metadata manually by executing the ALTER EXTERNAL TABLE ... REFRESH command.
The metadata for an external table can be refreshed automatically using the event notification service for your cloud storage service.
NEW QUESTION # 52
What are common query problems a Data Engineer can identify using Query Profile?
- A. Ineffective Data Sharing
- B. "Exploding" joins, i.e. joins that result in a "Cartesian product"
- C. Inefficient Pruning
- D. Queries Too Large to Fit in Memory
Answer: B,C,D
Explanation:
"Exploding" Joins
One of the common mistakes SQL users make is joining tables without providing a join condition (resulting in a "Cartesian product"), or providing a condition where records from one table match multiple records from another table. For such queries, the Join operator produces significantly (often by orders of magnitude) more tuples than it consumes.
This can be observed by looking at the number of records produced by a Join operator in the profile interface, and typically is also reflected in Join operator consuming a lot of time.
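The row-count blow-up is easy to reproduce in a few lines of Python. This is a toy nested-loop join over hypothetical `orders` and `customers` lists, meant only to show how the number of tuples produced compares with the number consumed.

```python
def join(left, right, condition=None):
    """Nested-loop join; with no condition every pair matches (Cartesian product)."""
    return [(l, r) for l in left for r in right
            if condition is None or condition(l, r)]


orders = [{"order_id": i, "customer_id": i % 10} for i in range(100)]
customers = [{"customer_id": i} for i in range(10)]

# No join condition: 100 x 10 input rows produce 1000 output tuples.
cartesian = join(orders, customers)

# Proper key condition: each order matches exactly one customer.
keyed = join(orders, customers,
             lambda o, c: o["customer_id"] == c["customer_id"])
```

The join consumes 110 rows in both cases, but without a condition it emits 1,000 tuples instead of 100. On real tables the same multiplication is what shows up as an outsized record count on the Join operator.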
Queries Too Large to Fit in Memory
For some operations (e.g. duplicate elimination for a huge data set), the amount of memory available for the compute resources used to execute the operation might not be sufficient to hold intermediate results. As a result, the query processing engine will start spilling the data to local disk. If the local disk space is not sufficient, the spilled data is then saved to remote disks.
This spilling can have a profound effect on query performance (especially if remote disk is used for spilling).
Spilling statistics can be checked in Query Profile Interface.
Inefficient Pruning
Snowflake collects rich statistics on data allowing it not to read unnecessary parts of a table based on the query filters. However, for this to have an effect, the data storage order needs to be correlated with the query filter attributes.
The efficiency of pruning can be observed by comparing Partitions scanned and Partitions total statistics in the TableScan operators. If the former is a small fraction of the latter, pruning is efficient. If not, the pruning did not have an effect.
Of course, pruning can only help for queries that actually filter out a significant amount of data. If the pruning statistics do not show data reduction, but there is a Filter operator above TableScan which filters out a number of records, this might signal that a different data organization might be beneficial for this query.
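The comparison of the two statistics amounts to a simple ratio, sketched below in Python. The function name `scanned_fraction` and the partition counts are made-up illustrative values, not figures from any real query profile.

```python
def scanned_fraction(partitions_scanned: int, partitions_total: int) -> float:
    """Fraction of a table's micro-partitions that a TableScan actually read."""
    if partitions_total <= 0:
        raise ValueError("partitions_total must be positive")
    return partitions_scanned / partitions_total


# Illustrative numbers only: a well-pruned scan versus a nearly full scan.
well_pruned = scanned_fraction(12, 4800)
nearly_full = scanned_fraction(4700, 4800)
```

A value close to 0 means pruning removed most of the table, while a value close to 1 means the scan read almost everything and the filter did not line up with the storage order.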
NEW QUESTION # 53
Which are supported Programming Languages for Creating UDTFs?
- A. Python
- B. JavaScript
- C. Node.js
- D. Java
- E. Perl
Answer: A,B,D
NEW QUESTION # 54
Which connector creates the RECORD_CONTENT and RECORD_METADATA columns in the existing Snowflake table while connecting to Snowflake?
- A. Spark Connector
- B. Node.js connector
- C. Kafka Connector
- D. Python Connector
Answer: C
Explanation:
Apache Kafka software uses a publish and subscribe model to write and read streams of records, similar to a message queue or enterprise messaging system. Kafka allows processes to read and write messages asynchronously. A subscriber does not need to be connected directly to a publisher; a publisher can queue a message in Kafka for the subscriber to receive later.
An application publishes messages to a topic, and an application subscribes to a topic to receive those messages. Kafka can process, as well as transmit, messages; however, that is outside the scope of this document. Topics can be divided into partitions to increase scalability.
Kafka Connect is a framework for connecting Kafka with external systems, including databases. A Kafka Connect cluster is a separate cluster from the Kafka cluster. The Kafka Connect cluster supports running and scaling out connectors (components that support reading and/or writing between external systems).
The Kafka connector is designed to run in a Kafka Connect cluster to read data from Kafka topics and write the data into Snowflake tables.
Every Snowflake table loaded by the Kafka connector has a schema consisting of two VARIANT columns:
RECORD_CONTENT. This contains the Kafka message.
RECORD_METADATA. This contains metadata about the message, for example, the topic from which the message was read.