5.0 - Vault Architecture¶
5.01 - Vault for Production Environments¶
Overview¶
- Up until this point, Vault has been used in development mode only.
- In dev mode, all data is stored in-memory ONLY.
- This is obviously not suitable for production; a proper storage backend is required. Vault supports a multitude of storage backend options, such as:
- Filesystem
- S3 Bucket (AWS)
- Databases (MySQL, PostgreSQL, etc.)
Deploying in Production Mode¶
- Vault servers are configured using a configuration file written in HCL or JSON.
- Example config:
storage "file" {
path = "/root/vault-data"
}
listener "tcp" {
address = "0.0.0.0:8200"
tls_disable = 1
}
- Once set up, Vault can be started using the config file:
vault server -config=/path/to/file
- As we're using a new backend, once the Vault server is started, it must be initialized.
- In this step, encryption keys are generated, as well as unseal keys and the initial root token.
- To do so:
vault operator init
- This will output the unseal keys and the initial root token.
- Once initialized, the Vault must be unsealed. This is true of every initialized Vault.
- Unsealing is required because Vault knows where to look in the particular storage backend, but it does not have the key to decrypt and read the data.
- To unseal the Vault:
vault operator unseal
Practical Example¶
- For any VMs used, ensure that SSH port 22 is open for ease of use.
- This example will be set up for Linux.
- Create the config file. The following stanzas should be added:
- Storage (required)
- Listener (required)
- Telemetry (optional)
storage "file" {
  path = "/path/to/vault/data" # path will be created if it doesn't exist!
}

listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 1 # shouldn't be disabled when running in production
}
- Starting the server:
vault server -config /path/to/config.hcl
- Initialize the Vault:
- Ensure the
VAULT_ADDR
environment variable is set
- Verify Vault operations by running
vault status
- Run
vault operator init
- Save the unseal keys and root token - they will not be shown again and will be inaccessible after this!
- Unseal the Vault:
vault operator unseal
- Provide one of the five unseal keys generated at initialization
- Repeat two more times such that 3 of the 5 unseal keys have been provided.
- You can test Vault operations by using e.g.
vault list auth/token/accessors
- This will fail, though: unlike dev mode, Vault in production mode does not automatically set the root token as the token in use.
- Log in as root via
vault login
and provide the root token.
- Future requests will automatically use the token.
- The same applies for any other tokens created (so long as the user has sufficient permissions!)
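- Putting the above together, a condensed sketch of the full first-run sequence (the address and placeholder values are illustrative):
export VAULT_ADDR="http://127.0.0.1:8200"
vault status # verify connectivity; reports Sealed: true before unsealing
vault operator init # prints 5 unseal keys + the initial root token
vault operator unseal <key-1>
vault operator unseal <key-2>
vault operator unseal <key-3> # threshold met - Vault is now unsealed
vault login <root-token>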
Verifying Storage Persistence¶
- As production mode persists storage, if Vault is shut down, the data isn't deleted.
- Upon restart of Vault, the process outlined previously must be followed to unseal the Vault and verify data persistence.
- Note: THE UNSEAL KEYS SHOULD NOT BE STORED WITH ONLY 1 PERSON - THEY SHOULD BE DISTRIBUTED AMONGST TEAM MEMBERS
5.02 - Vault UI for Production¶
- By default, the Vault GUI is disabled for production. However, it can be enabled via the Vault config file.
- All that is needed is adding the following to the config file:
ui = true # boolean
- Note: For production environments, you may wish to update the server's private address under the
listener
stanza.
- You can verify the system is listening on the desired address using
netstat -ntlp
- Once the Vault is unsealed, ensure that the desired port for HashiCorp Vault (in this case 8200) is open, to allow the user to access the Vault UI.
- Note: When accessing the UI whilst the Vault is in a sealed state, users are able to provide the unseal keys to unseal the Vault.
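- Putting this together, a minimal sketch of a UI-enabled production config (paths and addresses are illustrative):
ui = true

storage "file" {
  path = "/root/vault-data"
}

listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 1 # enable TLS for real production deployments
}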
5.03 - Understanding Vault Agent¶
Challenges¶
- Applications needing to interact with Vault must first authenticate to Vault and then use the resulting tokens for their required tasks.
- Beyond this, logic related to token renewal, etc. may need to be built into the application.
- Rather than building custom logic for the application(s), the Vault agent can be utilised.
Vault Agent Overview¶
- A client daemon that automates the workflow of client login and token refresh.
- It automatically authenticates to Vault for supported auth methods.
- It ensures tokens are renewed by re-authenticating as required, until renewal is no longer allowed.
- Additionally, it is designed with robustness and fault-tolerance in mind.
The Vault agent works as follows:
- Authenticates and acquires a token via the configured auth method
- The token is written to the configured sink(s)
- Client applications read the token from the sink and use it to authenticate to Vault
Running Vault Agent¶
- To use the Vault agent, the binary can be run in "agent mode"
- To do so, run
vault agent -config=<config file>
- The agent configuration file must specify the auth method and sink locations where the tokens are to be written.
Working of Vault Agent¶
- When Vault is started in agent mode, it will attempt to get a Vault token via the auth method specified in the agent config file.
- Upon successful authentication, the token is written to the sink locations.
- Whenever the current token's value changes, the agent writes the new token to the sinks.
Vault Agent Example¶
- Various Auth methods are available, including AppRole, Azure, and AWS.
- Ensure the AppRole Auth method is enabled either via the CLI or UI
vault auth enable approle
- Create a policy for the agent. An example follows:
path "auth/token/create" {
capabilities = ["update"]
}
- Create an approle to use the policy defined above:
vault write auth/approle/role/<role name> token_policies="<policy name>"
- Fetch the role ID:
vault read auth/approle/role/<role name>/role-id
- Generate a secret ID (this is a write operation, hence -f):
vault write -f auth/approle/role/<role name>/secret-id
- Add the following code (example) at minimum to the agent config file (written in HCL):
exit_after_auth = false
pid_file = "./pidfile"

auto_auth {
  method "approle" {
    mount_path = "auth/approle"
    config = {
      role_id_file_path = "/path/to/role-id" # files must exist!
      secret_id_file_path = "/path/to/secret-id" # files must exist!
      remove_secret_id_file_after_reading = false
    }
  }

  sink "file" {
    config = {
      path = "/path/to/token"
    }
  }
}

vault {
  address = "http://127.0.0.1:8200"
}
- Start the Vault in agent mode via
vault agent -config=/path/to/file.hcl
- The agent output will include information about the sink file where the token was written
- This can be looked up via
vault token lookup <token>
- Now, any application wishing to use the token must fetch it from the sink file location. As an example, a Kubernetes secret could be based on this.
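- As a minimal sketch, a script consuming the sink file might look like the following (the sink path is whatever was configured above; the secret path is illustrative):
export VAULT_TOKEN=$(cat /path/to/token) # read the agent-managed token
vault kv get secret/myapp/config # any authorised operation works from here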
5.04 - Vault Agent Caching¶
Primary Functionalities of Vault Agent¶
- Auto-Auth
- Automatically authenticates to Vault and manages the token renewal process
- Caching
- Allows client-side caching of responses containing newly-created tokens.
Overview of Caching¶
- Vault Agent caching follows a particular general process:
1. The application/service requests a lease or token from the Vault agent
2. Upon receiving the request, the agent checks its cache:
a. If the requested lease or token is in the cache, it is returned to the app/service
b. If not found in the cache, the request is forwarded to the Vault server to obtain the lease or token
3. If step 2b occurred, the Vault server returns the requested lease or token
4. The returned lease or token is stored in the agent cache, then returned to the application or service for usage.
In-Practice¶
- When looking to utilise the caching mechanism of the Vault agent, two particular stanzas must be added to the config:
cache {
use_auto_auth_token = true
}
listener "tcp" {
address = "127.0.0.1:8007"
tls_disable = true
}
- Start the agent using the config:
vault agent -config=/path/to/config.hcl
- Run a test command:
VAULT_TOKEN=<token> vault token create
- No logs will appear in the agent logs, as this request goes directly to the Vault server
- To resolve, point the CLI at the agent:
export VAULT_AGENT_ADDR="http://127.0.0.1:8007"
- If the request is made again, the same token will be returned rather than a new one generated - this is because the original token was stored in the cache of the agent.
- For other users/sessions, so long as they have the VAULT_AGENT_ADDR variable set, they will not require a token to run commands, as the agent's auto-auth token is used.
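- To see this behaviour end-to-end, a short sketch (the address matches the listener above):
export VAULT_AGENT_ADDR="http://127.0.0.1:8007"
vault token create # forwarded to the Vault server and cached by the agent
vault token create # returns the cached token rather than minting a new one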
5.05 - Shamir's Secret Sharing for the Unsealing Process¶
- The storage backend in Vault is considered untrusted.
- Any data stored is kept in an encrypted state.
- When vault is initialized, it generates an encryption key to protect all data stored within.
- This key is then protected by a master key
- Vault then uses Shamir's secret sharing algorithm to split the master key into 5 separate shares, where any 3 of which can be used to reconstruct the master key.
- This is why 3 keys must be used to unseal the vault.
Notes¶
- The number of shares and minimum threshold required can both be specified (see the example below).
- Shamir's technique can be disabled such that the master key can be used directly for unsealing.
- Once Vault receives the encryption key, it can decrypt the data in the storage backend and the Vault becomes unsealed
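- For instance, initializing with a custom split (values are illustrative; the same flags appear again in section 5.15):
vault operator init -key-shares=7 -key-threshold=4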
Seal Stanza¶
- In the Vault server config file, the particular seal type can be configured if usage of Shamir's technique is not desired.
seal "<type>" {
# seal details
}
- This is optional configuration - by default Shamir's algorithm will be used if not configured.
- Most deployments use Shamir's technique by default.
- Alternatives could include HSM or Cloud Key Management Solutions e.g. AWS KMS.
5.06 - Vault Auto-Unseal Overview¶
Background¶
- During the Vault unseal process, users must enter the key(s) required.
- This allows each share of the master key to be stored separately for improved security.
- This can lead to issues:
- If there are multiple Vault clusters in an organization, unsealing them can be a tiresome process heavily dependent upon the availability of users with the keys.
- Unsealing makes the process of automating a Vault install difficult.
- Automation tools can easily install, configure and start Vault, but unsealing it via Shamir's technique is heavily manual.
- To resolve these issues, Vault's Auto-Unseal mechanism can be utilised.
Auto-Unseal Overview¶
- This delegates the responsibility of securing the unseal key from users to a trusted device or service.
- At startup, Vault will connect to the device or service implementing the seal, then request it decrypt the master key, such that Vault can read from storage.
- The cloud-based key provides the master key information → the provided master key is then used to access the encryption key and decrypt the data.
- Auto-unseal can be pre-configured with the desired seal mechanism, e.g.:
- AWS KMS
- Transit Secret Engine
- Azure Key Vault
- HSM
- GCP Cloud KMS
5.07 - Auto-Unseal with AWS KMS¶
Setting up AWS Auto-Unseal From Scratch¶
- As a prerequisite when implementing auto-unseal, you should have an understanding of the mechanism to be used.
- From Services → Key Management Service (KMS)
- Create a symmetric key
- Leave all settings as default (unless otherwise required)
- Create an IAM user for authentication.
- Provide sufficient permissions (Admin for demo purposes)
- Note the access and secret keys associated with the IAM user.
- Typically, one would export them as environment variables for security purposes.
- Add the following to the Vault Server Config file:
seal "awskms" {
region = "<region code>"
access_key = "<access key>" # typically included as env variable instead
secret_key = "<secret key>" # typically included as env variable instead
kms_key_id = "<kms key id>"
endpoint = "<KMS API endpoint>" # optional - typically used for private connections over VPC from an EC2 instance
}
- Starting the Vault using this config:
vault server -config <config>.hcl
- Upon first starting, an error/warning will be provided saying that no keys were found. Vault must be initialized to provide the master key and key shares.
vault operator init
- Provides the root token and recovery keys
- This will automatically unseal the vault.
- The master key is now protected by the KMS key - on future restarts, Vault contacts KMS automatically to decrypt it, so no manual unsealing is required.
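- As noted in the config comments above, the access and secret keys are typically supplied via the standard AWS environment variables rather than hard-coded - a sketch with placeholder values:
export AWS_ACCESS_KEY_ID="<access key>"
export AWS_SECRET_ACCESS_KEY="<secret key>"
export AWS_REGION="<region code>"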
5.08 - Vault Plugin Mechanism¶
- Plugins are the building blocks of Vault. All auth and secret backends are considered plugins.
- This allows Vault to be extensible to suit user needs.
- This doesn't just apply to pre-existing plugins, custom plugins can also be developed.
- In general, to integrate a 3rd-party plugin, the following steps need to be done:
- Create the plugin
- Compile the plugin
- Start Vault pointing towards where the plugin is stored.
In Practice¶
- Install any desired packages e.g. go, git, etc. (Use Apt, Yum, etc as appropriate)
- Clone the code for the plugin using git, e.g. vault-guides/secrets/mock
- Compile the plugin using go:
go build -o /path/to/output/dir
- Start Vault, pointing to the plugin directory:
vault server -dev -dev-root-token-id=root -dev-plugin-dir=/path/to/dir
- Test the plugin functionality:
export VAULT_ADDR="http://127.0.0.1:8200"
vault login root
vault secrets enable my-mock-plugin
vault secrets list
- Verify the plugin is working and the secret path exists.
- Read and Write operations to the plugin path:
vault write my-mock-plugin/test message="Hello World"
vault read my-mock-plugin/test
Production¶
- In production, to use plugins, the following should be added to the config file:
plugin_directory = "/path/to/directory"
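- In addition, the plugin binary must be registered in the plugin catalog with its SHA256 checksum before it can be enabled - a sketch (paths and the plugin name are illustrative):
SHA256=$(sha256sum /path/to/directory/my-mock-plugin | cut -d' ' -f1)
vault plugin register -sha256="$SHA256" secret my-mock-plugin
vault secrets enable my-mock-plugin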
5.09 - Audit Devices¶
- Components in Vault that log requests and responses to Vault.
- As each operation is an API request/response, the audit log includes every authenticated interaction with Vault, including errors.
- By default, auditing is not enabled. It must be enabled by a root user using
vault audit enable
Audit Devices Example¶
- To enable file audit:
vault audit enable file file_path=path/to/file.log
- Audit mechanisms in place can be viewed by
vault audit list
- Note: The audit log is by default in JSON format.
- Details logged include:
- Client token used
- Accessor
- User
- Policies involved
- Token type.
- Note: Other options are available:
- Syslog
- Socket
- Enabling in Linux:
vault audit enable -path="audit_path" file file_path=/var/log/vault-audit.log
- The audit log is stored at /var/log
- Viewable via
cat <audit log> | jq
- Note: The client token value is hashed in the log - the hash for a given value can be computed via the endpoint
/sys/audit-hash
- Sample request:
- vault print token
gets the original token in plaintext
- curl --header "X-Vault-Token: <vault-token>" --request POST --data @audit.json http://127.0.0.1:8200/v1/sys/audit-hash/<path>
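- The request body (audit.json above) holds the plaintext value to be hashed under the input field - a minimal sketch:
{
  "input": "<token value to hash>"
}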
Important Pointers¶
- If there are any audit devices enabled, Vault requires at least one to persist the logs before completing a request.
- If only one device is enabled and it is blocking, Vault will be unresponsive until the audit device can write.
- If more than one audit device is enabled in addition to the blocking one, Vault will be unaffected.
5.11 - Vault Enterprise Overview¶
- Includes a number of features benefitting organizational workflows.
- Features include:
- Disaster Recovery
- Namespaces
- Monitoring
- Multi-Factor Authentication
- Auto-Unseal with HSM
- Once Vault Enterprise is acquired - instructions are provided for installation.
- In terms of appearance, there is little difference between Enterprise and Open-Source versions.
5.12 - Vault Namespaces¶
Background Context¶
- In organizations, there are typically multiple teams wanting to use services such as Vault for their own particular purposes.
- In the case of Vault, they would need to manage their own secrets, auth methods, etc.
- To allow for this segregation and to minimize the impact teams may have on one another, namespaces can be used.
- Each Vault namespace can have its own:
- Policies
- Auth Methods
- Secrets Engines
- Tokens
- Identity entities and groups.
Namespaces Example¶
- When logging into Vault, you can specify the namespace to log into if desired.
- Namespaces are created under
Access
in the Vault GUI.
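- Namespaces can also be managed via the CLI (Enterprise only) - a brief sketch with an illustrative namespace name:
vault namespace create engineering
vault namespace list
export VAULT_NAMESPACE="engineering" # subsequent CLI calls target this namespace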
5.13 - Vault Replication¶
- Having a single Vault cluster can impose various challenges, such as:
- High latency
- Connection issues
- Availability loss
- It is therefore beneficial to have multiple clusters across different regions, allowing users in each region to manage their own secrets, etc. in the cluster closest to them.
- Replication serves to resolve this. There are multiple types available.
Performance Replication¶
- Secondary regions keep track of their own tokens and leases, but share the same underlying configuration, policies, and supporting secrets (K/V values, encryption keys for transit, etc.) with the primary region.
Disaster Recovery Replication¶
- Allows for a full restoration of all types of data (including local and cluster data)
- Service tokens and leases are valid across both clusters.
- The secondary cluster does not handle any client requests, and can be promoted to the new primary in the event of disaster.
5.14 - Monitoring Telemetry in Vault¶
- Telemetry covers any data being output by a particular device.
- Analysis of this can be useful for areas such as:
- Performance
- Troubleshooting
- In Vault, there are two data sets to note:
- Metrics - Output via Telegraf
- Vault Audit Logs - Output via Fluentd
- These metrics can be output to various tools such as Prometheus.
Metrics Output¶
- Further details for each metric are available in the documentation, however metrics covered include:
- Policy-based
- Token-based
- Resource usage
Configuration¶
- In the Vault Server config, one can add a config block to the file to point to a particular location for the metrics to be exported to.
telemetry {
statsite_address = "statsite.company.local:8125"
}
- Metrics can also be fetched from the
/sys/metrics
endpoint
- Sample curl requests are available in the documentation
- The format of the metrics can be configured e.g. JSON, Prometheus
- e.g.
curl --header "X-Vault-Token: <token>" http://127.0.0.1:8200/v1/sys/metrics
- Additional documentation is available for configuring Vault with various monitoring integrations; including Splunk.
5.15 - High-Availability Setup & Implementation in Vault¶
Overview of HA¶
- In general, running a single instance of anything is risky. Vault is no exception.
- Vault supports a multi-server mode for High-Availability. This further protects organisations against outages by running multiple servers.
- The data will be replicated across each server based on the "leader".
- Note: High-Availability IS STORAGE BACKEND DEPENDENT - Integrated Storage is also available to support this.
- Out of the box, Raft is available for use as an integrated storage backend.
- To configure:
storage "raft" {
path = "/path/to/raft/data" # defines where the data will be stored
node_id = "raft_node_1"
}
cluster_addr = "http://127.0.0.1:8201"
High Availability Example¶
- Start three separate instances of Vault with the above config:
vault server -config=/path/to/config.hcl
- List the raft peers:
vault operator raft list-peers
- One will be noted as "leader" under state
- Run a test command in the leader node to store data:
vault kv put secret/dbcreds admin=password
- On one of the follower nodes, test access to the secrets:
vault kv get secret/dbcreds
- The secret data should be displayed.
- In the event that the leader node goes down, the data will still be accessible AND a new leader node will be assigned.
Implementing Vault HA¶
- Raft Storage can be configured by adding the following to the config file:
storage "raft" {
path = "/path/to/raft/data" # defines where the data will be stored
node_id = "raft_node_1"
}
cluster_addr = "http://127.0.0.1:8201"
- You are advised to set
disable_mlock
to true and disable memory swapping on the system.
- Start the server with
vault server -config=/path/to/config
- The Vault will be shown to be operational with Raft storage ready
- In a separate terminal, initialise the Vault:
vault operator init -key-shares=<value> -key-threshold=<value> > key.txt
- Unseal:
vault operator unseal <unseal key>
- Login:
vault login <token>
- Check the Raft nodes:
vault operator raft list-peers
- Repeat the following for each additional node/Vault server as required (do not re-run initialisation):
export VAULT_ADDR='address'
- Join the node to the server
vault operator raft join http://<leader IP address>:8200
- To verify, put a secret to a particular path e.g.:
vault secrets enable -path=secret kv
vault kv put secret/creds admin=password
- On another node:
vault kv get secret/creds
should return the secret.
- Test the replication and leadership transfer by restarting the leader node and running
vault operator raft list-peers
- To be highly available, one of the Vault server nodes grabs a lock within the data store.
- The successful server node becomes the "active" node - all others become standby nodes.
- At this point, if the standby nodes receive a request, they will either forward the request or redirect the client depending on the configuration.
- Nodes can be stepped down from active duty by using the
vault operator step-down
command (run against the active node's address).
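- As an alternative to joining each node manually, a follower's config can point at the leader using a retry_join block - a sketch, with illustrative addresses and a unique node_id per server:
storage "raft" {
  path = "/path/to/raft/data"
  node_id = "raft_node_2" # must be unique per server

  retry_join {
    leader_api_addr = "http://127.0.0.1:8200" # the leader's API address
  }
}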
5.16 - Raft Storage - Snapshot and Restore¶
Snapshot and restore operations can be carried out on Raft storage using the following commands:
vault operator raft snapshot save <file>.snap
vault operator raft snapshot restore <file>.snap
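A brief usage sketch (the filename is illustrative; run against the active node):
vault operator raft snapshot save backup-2024-01-01.snap
# restoring into a cluster other than the snapshot's origin requires -force
# (treat this flag as an assumption to verify against the docs)
vault operator raft snapshot restore -force backup-2024-01-01.snap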