How to configure credentials
The Great Expectations CLI is no longer the preferred method for implementing and configuring Great Expectations. This topic will be updated soon to reflect this change. For more information, see A fond farewell to the CLI.
This guide will explain how to configure your great_expectations.yml
project config to populate credentials from either a YAML file or a secret manager.
If your Great Expectations deployment is in an environment without a file system, refer to How to instantiate an Ephemeral Data Context.
- YAML
- Secret Manager
Prerequisites
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
Steps
1. Save credentials and config
Decide where you would like to save the desired credentials or config values - in a YAML file, environment variables, or a combination - then save the values.
In most cases, we suggest using a config variables YAML file. YAML files make variables more visible and easier to edit, and they allow for modularization (e.g. one file for dev, another for prod).
- In the great_expectations.yml config file, environment variables take precedence over variables defined in a config variables YAML file.
- Environment variable substitution is supported in both the great_expectations.yml and config_variables.yml config files.
If using a YAML file, save desired credentials or config values to great_expectations/uncommitted/config_variables.yml
or another YAML file of your choosing:
my_postgres_db_yaml_creds:
  drivername: postgresql
  host: localhost
  port: 5432
  username: postgres
  password: ${MY_DB_PW}
  database: postgres
- If you wish to store values that include the dollar sign character $, please escape them using a backslash \ so substitution is not attempted. For example, in the above example for Postgres credentials you could set password: pa\$sword if your password is pa$sword. Say that 5 times fast, and also please choose a more secure password!
- When you save values via the CLI (Command Line Interface), they are automatically escaped if they contain the $ character.
- You can also have multiple substitutions for the same item, e.g. database_string: ${USER}:${PASSWORD}@${HOST}:${PORT}/${DATABASE}
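The substitution rules above (placeholders wrapped in ${}, backslash-escaped dollar signs, several placeholders in one value, and environment variables taking precedence) can be illustrated with a small Python sketch. This is not the Great Expectations implementation; the function name substitute and its arguments are hypothetical and exist only for illustration.

```python
import re

# Matches either an escaped dollar sign (\$) or a ${NAME} placeholder.
_PATTERN = re.compile(r"\\\$|\$\{(\w+)\}")


def substitute(template, config_variables, environ):
    """Replace ${NAME} placeholders (hypothetical helper, not the GX API).

    Values found in `environ` win over values in `config_variables`.
    """

    def _replace(match):
        if match.group(0) == "\\$":
            # Escaped dollar sign: emit a literal "$", no substitution attempted.
            return "$"
        name = match.group(1)
        if name in environ:
            return str(environ[name])
        if name in config_variables:
            return str(config_variables[name])
        raise KeyError(f"No value found for placeholder ${{{name}}}")

    return _PATTERN.sub(_replace, template)


# Illustrative inputs; in practice `environ` would be os.environ.
env = {"USER": "postgres", "HOST": "localhost"}
cfg = {"HOST": "db.internal", "PORT": 5432}

connection = substitute("${USER}@${HOST}:${PORT}", cfg, env)
escaped = substitute(r"pa\$sword", cfg, env)
```

Here HOST is defined in both places, so the environment value is used, and the escaped password passes through with a literal dollar sign.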
If using environment variables, set values by entering export ENV_VAR_NAME=env_var_value
in the terminal or adding the commands to your ~/.bashrc
file:
export POSTGRES_DRIVERNAME=postgresql
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_USERNAME=postgres
export POSTGRES_PW=
export POSTGRES_DB=postgres
export MY_DB_PW=password
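Before loading your project, it can help to confirm that the exported variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name missing_variables is hypothetical, and the variable list simply mirrors the export example above.

```python
import os

# Variable names taken from the export example above.
REQUIRED_VARS = [
    "POSTGRES_DRIVERNAME",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_USERNAME",
    "POSTGRES_PW",
    "POSTGRES_DB",
]


def missing_variables(names, environ=None):
    """Return the names that are unset or set to an empty string."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_variables(REQUIRED_VARS)
    if missing:
        print("Unset or empty:", ", ".join(missing))
```

A variable that is exported with an empty value is reported as missing, which catches a password placeholder that was never filled in.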
2. Set config_variables_file_path
If using a YAML file, set the config_variables_file_path
key in your great_expectations.yml
or leave the default.
config_variables_file_path: uncommitted/config_variables.yml
3. Replace credentials with placeholders
Replace credentials or other values in your great_expectations.yml with ${}-wrapped variable names (e.g. ${ENVIRONMENT_VARIABLE} or ${YAML_KEY}).
datasources:
  my_postgres_db:
    class_name: Datasource
    module_name: great_expectations.datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SqlAlchemyExecutionEngine
      credentials: ${my_postgres_db_yaml_creds}
    data_connectors:
      default_inferred_data_connector_name:
        class_name: InferredAssetSqlDataConnector
  my_other_postgres_db:
    class_name: Datasource
    module_name: great_expectations.datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SqlAlchemyExecutionEngine
      credentials:
        drivername: ${POSTGRES_DRIVERNAME}
        host: ${POSTGRES_HOST}
        port: ${POSTGRES_PORT}
        username: ${POSTGRES_USERNAME}
        password: ${POSTGRES_PW}
        database: ${POSTGRES_DB}
    data_connectors:
      default_inferred_data_connector_name:
        class_name: InferredAssetSqlDataConnector
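In the first datasource above, a single placeholder stands in for an entire credentials mapping, while the second fills in each field separately. The sketch below shows one way whole-value replacement could work on a nested config. It is an illustration under assumptions, not the library's code: the function name resolve is hypothetical, and it only handles values that consist of exactly one placeholder.

```python
import re

# A value that is nothing but a single ${NAME} placeholder.
_WHOLE_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(node, variables):
    """Walk a nested config and swap whole-value placeholders (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: resolve(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, variables) for item in node]
    if isinstance(node, str):
        match = _WHOLE_PLACEHOLDER.fullmatch(node)
        if match and match.group(1) in variables:
            # The replacement may itself be a mapping, not just a string.
            return variables[match.group(1)]
    return node


# Illustrative data shaped like the YAML examples in this guide.
variables = {
    "my_postgres_db_yaml_creds": {"drivername": "postgresql", "host": "localhost"}
}
config = {"execution_engine": {"credentials": "${my_postgres_db_yaml_creds}"}}

resolved = resolve(config, variables)
```

After this call, resolved["execution_engine"]["credentials"] is the full mapping rather than a string. Unknown placeholders are left untouched in this sketch.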
Additional Notes
- The default config_variables.yml file located at great_expectations/uncommitted/config_variables.yml applies to deployments created using great_expectations init.
- To view the full script used in this page, see it on GitHub: how_to_configure_credentials.py
Choose which secret manager you are using:
- AWS Secrets Manager
- GCP Secret Manager
- Azure Key Vault
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from AWS Secrets Manager.
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
- Configured a secret manager and secrets in the cloud with AWS Secrets Manager
Secrets store substitution uses the configurations from your great_expectations.yml project config after all other types of substitution are applied (from environment variables or from the config_variables.yml config file).
Secrets store substitution is keyword-based. It tries to retrieve a secret from the secrets store only for the following values:
- AWS: values starting with secret|arn:aws:secretsmanager
If a value you provide does not match the keyword above, it is not substituted.
Setup
To use AWS Secrets Manager, you may need to install the great_expectations
package with its aws_secrets
extra requirement:
pip install 'great_expectations[aws_secrets]'
To substitute a value with a secret from AWS Secrets Manager, provide the ARN of the secret, like this one:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret-1zAyu6
The last 7 characters of the ARN are automatically generated by AWS and are not required to retrieve the secret, so secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret will retrieve the same secret.
You will get the latest version of the secret by default.
To retrieve a specific version of the secret, specify its version UUID like this: secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret|key
Or like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000|key
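To make the reference format concrete, here is a minimal Python sketch that splits such a value into its ARN, optional version UUID, and optional JSON key. The regular expression and the function name parse_aws_secret_reference are assumptions written for this illustration; they are not taken from the Great Expectations source.

```python
import re

# secret|<arn>[:<version-uuid>][|<json-key>]  (pattern is an assumption)
_AWS_SECRET = re.compile(
    r"secret\|(?P<arn>arn:aws:secretsmanager:[a-z0-9\-]+:\d{12}:secret:[^:|]+)"
    r"(?::(?P<version>[0-9a-f\-]{36}))?"
    r"(?:\|(?P<key>[^|]+))?"
)


def parse_aws_secret_reference(value):
    """Return the parts of a secret reference, or None if it is not one."""
    match = _AWS_SECRET.fullmatch(value)
    if match is None:
        # Values without the keyword are left for the caller to use as-is.
        return None
    return match.groupdict()


parts = parse_aws_secret_reference(
    "secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret"
    ":00000000-0000-0000-0000-000000000000|key"
)
```

The version and key entries are None when the reference omits them. With boto3, the arn and version parts would correspond to the SecretId and VersionId arguments of the Secrets Manager client's get_secret_value call.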
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|drivername
      host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|host
      port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|port
      username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|username
      password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|password
      database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_HOST
      port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PORT
      username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_USERNAME
      password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PASSWORD
      database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from GCP Secret Manager.
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
- Configured a secret manager and secrets in the cloud with GCP Secret Manager
Secrets store substitution uses the configurations from your great_expectations.yml project config after all other types of substitution are applied (from environment variables or from the config_variables.yml config file).
Secrets store substitution is keyword-based. It tries to retrieve a secret from the secrets store only for the following values:
- GCP: values matching the following regex
^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets
If a value you provide does not match the pattern above, it is not substituted.
Setup
To use GCP Secret Manager, you may need to install the great_expectations
package with its gcp
extra requirement:
pip install 'great_expectations[gcp]'
To substitute a value with a secret from GCP Secret Manager, provide the name of the secret, like this one:
secret|projects/project_id/secrets/my_secret
You will get the latest version of the secret by default.
To retrieve a specific version of the secret, specify its version id like this: secret|projects/project_id/secrets/my_secret/versions/1
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|projects/project_id/secrets/my_secret|key
Or like this:
secret|projects/project_id/secrets/my_secret/versions/1|key
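The sketch below illustrates how a reference in this format could be turned into a full secret version path, and how the optional key could then be applied to a JSON payload. Both helper names are hypothetical and the regular expression is an assumption based on the formats shown above; this is not the library's implementation, and it does not call GCP.

```python
import json
import re

# secret|projects/<project>/secrets/<name>[/versions/<id>][|<json-key>]
_GCP_SECRET = re.compile(
    r"secret\|(?P<name>projects/[a-z0-9_\-]{6,30}/secrets/[a-zA-Z0-9_\-]+)"
    r"(?:/versions/(?P<version>[a-z0-9]+))?"
    r"(?:\|(?P<key>[^|]+))?"
)


def gcp_secret_version_path(value):
    """Return (version_path, key), or None if the value is not a secret reference."""
    match = _GCP_SECRET.fullmatch(value)
    if match is None:
        return None
    # Secret Manager accepts "latest" as an alias for the newest version.
    version = match.group("version") or "latest"
    return f"{match.group('name')}/versions/{version}", match.group("key")


def select_key(payload, key):
    """Return the whole payload, or one entry when the payload is a JSON object."""
    if key is None:
        return payload
    return json.loads(payload)[key]


path, key = gcp_secret_version_path(
    "secret|projects/project_id/secrets/my_secret/versions/1|key"
)
value = select_key('{"key": "value"}', key)
```

The payload string here is made up for the example; in a real deployment it would be the decoded secret data returned by Secret Manager.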
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|drivername
      host: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|host
      port: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|port
      username: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|username
      password: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|password
      database: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST
      port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT
      username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME
      password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD
      database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from Azure Key Vault.
- Completion of the Quickstart guide.
- A working installation of Great Expectations.
- Set up a working deployment of Great Expectations
- Configured a secret manager and secrets in the cloud with Azure Key Vault
Secrets store substitution uses the configurations from your great_expectations.yml project config after all other types of substitution are applied (from environment variables or from the config_variables.yml config file).
Secrets store substitution is keyword-based. It tries to retrieve a secret from the secrets store only for the following values:
- Azure: values matching the following regex
^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net
If a value you provide does not match the pattern above, it is not substituted.
Setup
To use Azure Key Vault, you may need to install the great_expectations
package with its azure_secrets
extra requirement:
pip install 'great_expectations[azure_secrets]'
To substitute a value with a secret from Azure Key Vault, provide the name of the secret, like this one:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret
You will get the latest version of the secret by default.
To retrieve a specific version of the secret, specify its version id (32 lowercase alphanumeric characters) like this: secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret|key
Or like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab|key
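As an illustration of the reference format, the following Python sketch splits a value into the vault URL, secret name, optional version id, and optional JSON key. The function name parse_azure_secret_reference is hypothetical and the regular expression is an assumption built from the formats shown above, with a deliberately permissive secret-name pattern; it is not the library's own pattern.

```python
import re

# secret|https://<vault>.vault.azure.net/secrets/<name>[/<version>][|<json-key>]
_AZURE_SECRET = re.compile(
    r"secret\|(?P<vault_url>https://[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net)"
    r"/secrets/(?P<name>[^/|]+)"
    r"(?:/(?P<version>[0-9a-z]{32}))?"
    r"(?:\|(?P<key>[^|]+))?"
)


def parse_azure_secret_reference(value):
    """Return the parts of a Key Vault reference, or None if it is not one."""
    match = _AZURE_SECRET.fullmatch(value)
    if match is None:
        # Values that do not match are left for the caller to use unchanged.
        return None
    return match.groupdict()


parts = parse_azure_secret_reference(
    "secret|https://my-vault-name.vault.azure.net/secrets/my-secret"
    "/a0b00aba001aaab10b111001100a11ab|key"
)
```

The version and key entries are None when the reference omits them, in which case the latest version and the whole secret value would be used.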
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|drivername
      host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host
      port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|port
      username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|username
      password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|password
      database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_HOST
      port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PORT
      username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_USERNAME
      password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PASSWORD
      database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DATABASE