# Steampipe > sidebar_label: Coding Standards --- --- title: Coding Standards sidebar_label: Coding Standards --- # Coding Standards ## Code Formatting Code should be formatted with gofmt. ## Comments - Code "sections" are broken up by comments that start with 4 slashes, so that the code editors don't fold them. The section name should be uppercase. - Example: `//// HYDRATE FUNCTIONS` - All public functions/structs etc should have a comment per the standard recommended in the go docs: > "Doc comments work best as complete sentences, which allow a wide variety of automated presentations. The first sentence should be a one-sentence summary that starts with the name being declared.": ```go // Install installs a plugin in the local file system func Install(image string) (string, error) { ... ``` - Add the package comment to the `plugin.go` file, per the go docs: > "Every package should have a package comment, a block comment preceding the package clause. For multi-file packages, the package comment only needs to be present in one file, and any one will do. The package comment should introduce the package and provide information relevant to the package as a whole. It will appear first on the godoc page and should set up the detailed documentation that follows." ## Repo Structure Each plugin should reside in a separate Github repository, named `steampipe-plugin-{plugin name}`. The repo should contain: - `README.md` - `LICENSE` - `main.go` - A folder, named for the plugin, that contains the go files, including: - `plugin.go` - A `.go` source file for each table. Go files that implement a table should be prefixed with `table_`. - Any other go files required for your plugin package - Shared functions should be added to a `utils.go` file - A `docs` folder that contains the documentation for your plugin in markdown format. These documents are used to create the online documentation at hub.steampipe.io. The folder should contain: - An `index.md` that describes the plugin, how to set it up and use it, any prerequisites, and config options. - A subfolder called `tables`. This folder should contain a file per table named `{table_name}.md` with example queries for that table. - A `config` folder that contains the default connection config file for the plugin. ### Example: Repo Structure ```bash . ├── LICENSE ├── Makefile ├── README.md ├── aws │   ├── plugin.go │   ├── service.go │   ├── table_aws_acm_certificate.go │   ├── table_aws_api_gateway_api_authorizer.go │   ├── table_aws_api_gateway_api_key.go │ ... │   └── utils.go ├── config │   └── aws.spc ├── docs │   ├── index.md │   └── tables │   ├── aws_acm_certificate.md │   ├── aws_api_gateway_api_authorizer.md │   ├── aws_api_gateway_api_key.md │ ... ├── go.mod ├── go.sum └── main.go ``` --- --- title: Developers sidebar_label: Developers --- # Steampipe Architecture ## Overview Steampipe Architecture

Steampipe uses a Postgres Foreign Data Wrapper to present data from external systems and services as database tables. The Steampipe Foreign Data Wrapper (FDW) provides a Postgres extension that allows Postgres to connect to external data in a standardized way. The Steampipe FDW does not directly interface with external systems, but instead relies on plugins to implement the API/provider specific code and return it in a standard format via gRPC. This approach simplifies extending Steampipe as the Postgres-specific logic is encapsulated in the FDW, and API and service specific code resides only in the plugin. ## Design Principles ### It should "just work" One of the goals of Steampipe since we first started envisioning it is that it should be simple to install and use - you should not need to spend hours downloading pre-requisites, fiddling with config files, setting up credentials, or poring over documentation. We've tried very hard to bring that vision to reality, and hope that it is reflected in Steampipe as well as our plugins. When writing plugins, attempt to make it work out of the box as much as possible: - Use the vendor's CLI default credential mechanism and resolution order (if applicable). For example, we use the normal `aws` CLI credentials for our `aws` plugin - `select * from aws_ec2_instance` works the same as `aws ec2 describe-instances`, using the AWS credentials file and/or standard environment variables. - Use sane defaults that align with the vendor's CLI tool, API, or UI. Configuration options should be exactly that - *optional*. - Where possible, avoid any dependence on other 3rd party tools or libraries that are not compiled into your plugin binary. ### It should feel simple, intuitive, and familiar We chose SQL as the language for Steampipe as much for its ubiquity as its power - it was invented in the early 1970s, and became an ANSI standard in 1986. Most developers and engineers have at least some exposure to it, and as a result can start using it right away. There are thousands of 3rd party tools that support PostgreSQL that you can just plug in. When writing plugins, strive for similar simplicity and consistency: - Follow the Steampipe standards. - Don't re-invent the wheel - use the names, terms, and values that users are already familiar with. We typically align our table and column names with the equivalent Terraform resource if one is available, and with the API naming if not. ### It should be fast (but responsible) The magic of Steampipe is that it feels like a database, yet it doesn't store any data. We put in quite a lot of effort to make it feel fast and responsive, minimizing the number of API calls based on the request, using multi-threading to parallelize requests, and streaming results. While much of this work is handled by Steampipe itself, you should endeavor to keep things tight in your plugins as well: - Don't make extraneous API calls. - Make intelligent use of caching. - Back off intelligently if you get throttled by your API. - Attempt to design tables and columns such that you do not overwhelm the service or API that you are connecting to. ### It should be clever and flexible As we first started building Steampipe, we realized we were on to something because every time someone implemented a new table, someone else came up with new ideas for how to use it.
We have added features like autocomplete and `.inspect` to make it easy to discover things, flexible output formats suitable to humans and computers, and utility tables to turn complex json columns into easy-to-use tables. We have a big vision for Steampipe, but we sincerely hope that our users -- ***YOU!*** -- do things with Steampipe that we haven't even dreamed of. When you write your plugin, make hard things easy, and many things possible: - Normalize complex structures, but make raw json available as well. - Build something usable and share it as soon as it's MVP. Be agile - iterate! - Design for real use-cases, and imagine possibilities. --- --- title: Plugin Release Checklist sidebar_label: Plugin Release Checklist --- # Plugin Release Checklist As of June 2025, we've absorbed 149+ plugins into the Hub. If you want to contribute one -- and we hope you do! -- here are the most common things we ask contributors to check to prepare for the plugin's release. Feel free to tick the boxes as you go through the list! ## Basic Configuration Repository name The repository name should use the format `steampipe-plugin-{plugin name}`, e.g., `steampipe-plugin-aws`, `steampipe-plugin-googledirectory`, `steampipe-plugin-microsoft365`. The plugin name should be one word, so there are always 3 parts in the repository name. Repository topics To help with discoverability in GitHub, the repository topics should include: - postgresql - postgresql-fdw - sql - steampipe - steampipe-plugin Repository website The repository website/homepage should link to the Hub site. The URL is composed of the GitHub organization and plugin name, for instance: - https://github.com/turbot/steampipe-plugin-aws results in https://hub.steampipe.io/plugins/turbot/aws - https://github.com/francois2metz/steampipe-plugin-airtable results in https://hub.steampipe.io/plugins/francois2metz/airtable Go version The Go version in `go.mod` and any workflows is 1.24. .goreleaser.yml The `.goreleaser.yml` file uses the standard format, e.g., [AWS plugin .goreleaser.yml](https://github.com/turbot/steampipe-plugin-aws/blob/main/.goreleaser.yml). CHANGELOG A `CHANGELOG.md` is included and contains release notes for the upcoming version (typically v0.0.1). License The plugin uses the Apache License 2.0. Makefile The `Makefile` file is present and builds to the correct plugin path. ## Configuration File .spc examples The `config/PLUGIN.spc` file is neatly formatted, and explains each argument with links as appropriate, using realistic values, e.g., "xoxp-abcads…" instead of "TOKEN_HERE". Environment variables Arguments that can also be set via environment variable include the environment variable name(s) in their descriptions. ## Credentials Terraform compatibility If there's a Terraform provider for your API, the plugin supports the same credential methods as the provider. Existing CLI credentials When there are commonly used CLI credentials, like `.aws/credentials`, the plugin works with them. Expiry When credentials expire, and the API's SDK does not automatically refresh them, the plugin alerts the user and tells them how to refresh. Environment variables It's possible to set credentials using an environment variable if the API's SDK also supports using environment variables. ## Table and Column Names Standard names All table and column names follow our [Table & Column Naming Standards](https://steampipe.io/docs/develop/standards#naming). ## Table and Column Descriptions Descriptions Every table and column has a description.
These are consistent across tables. Other standards All descriptions adhere to the [Table and Column Descriptions Standards](https://steampipe.io/docs/develop/standards#table-and-column-descriptions). ## Table and Column Design Global and per-authenticated-user data Many plugins can return both global data, e.g., all GitHub repos or Google Drive files, and data only for the authenticated user (my repos, my files). If that's the case, there are separate tables, e.g. `github_repository` and `github_my_repository`. Common columns If tables share columns, these are abstracted as shown in the AWS plugin's [common_columns.go](https://github.com/turbot/steampipe-plugin-aws/blob/main/aws/common_columns.go). Required configuration arguments The plugin checks required configuration arguments are set once at load time. ### Logging Error info When the plugin returns an error, it includes the location and any related args, along with the error itself. See [example](https://github.com/turbot/steampipe-plugin-linode/blob/343d38188e38e32635b1c65c3f0d69bd2d2ef87f/linode/table_linode_kubernetes_cluster.go#L46). ### Data Ingestion Default transform The plugin sets a preferred transform as the default. For example, the [GitLab plugin](https://hub.steampipe.io/plugins/theapsgroup/gitlab) uses [DefaultTransform: transform.FromGo().NullIfZero()](https://github.com/theapsgroup/steampipe-plugin-gitlab/blob/main/gitlab/plugin.go#L16). Please see [Transform Functions](https://steampipe.io/docs/develop/writing-plugins#transform-functions) for a full list of transform functions. Pagination The plugin implements pagination in each table's List function supported by the API's SDK. If pagination is implemented, the plugin sets the page size per request to the maximum allowed; however, if `QueryContext.Limit` is smaller than that page size, the page size should be set to the limit. See [example](https://github.com/turbot/steampipe-plugin-tfe/blob/253107f6d9851e14cc593ff657ddd3cb41c505bc/tfe/table_tfe_team.go#L48-L59). Hydrate function pagination If a non-List hydrate function requires paging, consider separating that data into a separate table. Columns that require separate hydrate data that uses paging can lead to throttling and rate limiting errors unexpectedly. Backoff and retry If the API SDK doesn't automate backoff and retry, the plugin leverages capabilities of the Steampipe plugin SDK's [RetryHydrate function](https://pkg.go.dev/github.com/turbot/steampipe-plugin-sdk/plugin#RetryHydrate). For instance, the `github_issue` table uses this function when [listing issues](https://github.com/turbot/steampipe-plugin-github/blob/d0a70b72e125c75940006ee6c66072c8bfa2e210/github/table_github_issue.go#L142) due to the strict throttling of the GitHub API. Maximum concurrency If the API has strict rate limiting, the table sets [HydrateConfig.MaxConcurrency](https://pkg.go.dev/github.com/turbot/steampipe-plugin-sdk/plugin#HydrateConfig.MaxConcurrency) for the relevant hydrate functions. For instance, the `googleworkspace_gmail_message` table limits the number of [getGmailMessage calls](https://github.com/turbot/steampipe-plugin-googleworkspace/blob/55686791222b02e7fb117cb398ea3fd76c2d1b1e/googleworkspace/table_googleworkspace_gmail_message.go#L49-L54). Context cancellation Each table's list hydrate function checks for remaining rows from the API SDK, and aborts inside loops (e.g., while streaming items) if there are none. 
(See [example](https://github.com/turbot/steampipe-plugin-aws/blob/a0050b3a27db7f61a353bc9ae38e7dd072ed87b9/aws/table_aws_cloudcontrol_resource.go#L110-L113).) ### Column Types Money Money is represented as a string, not a double which is never exact. ### Dynamic Tables Specifying tables to generate If the plugin can generate [dynamic tables](https://steampipe.io/docs/develop/writing-plugins#dynamic-tables), a configuration argument should allow users to specify which tables the plugin will generate. This configuration argument typically accepts a list of strings and should support filesystem glob patterns like in the [CSV plugin](https://hub.steampipe.io/plugins/turbot/csv#configuration). If this configuration argument is not set or is explicitly empty, e.g., `paths = []`, then no dynamic tables should be generated. Default tables The plugin should determine if it will generate dynamic tables by default after plugin installation based on if the configuration argument mentioned above is commented by default. For instance, in the [Prometheus plugin](https://github.com/turbot/steampipe-plugin-prometheus/blob/f6dbe388d729526a1a5a5b4c06d414dcc01c1548/config/prometheus.spc#L7-L14), the `metrics` configuration argument is commented. After plugin installation, the plugin will not generate dynamic tables unless the user adds a non-commented value for `metrics`. You may not want to load dynamic tables by default if it drastically increases the plugin initialization time due to the number of tables. Table name prefixes When naming dynamic tables, the plugin name prefix, e.g., `kubernetes_`, should be added if it helps avoid namespace collisions or if it helps group them with static tables that share the same prefix. ## Documentation ### Index Documentation Front matter The index document contains a front matter block, like the one below: ```yml --- organization: Turbot category: ["security"] icon_url: "/images/plugins/turbot/duo.svg" brand_color: "#6BBF4E" display_name: Duo Security name: duo description: Steampipe plugin for querying Duo Security users, logs and more. og_description: Query Duo Security with SQL! Open source CLI. No DB required. og_image: "/images/plugins/turbot/duo-social-graphic.png" --- ``` Front matter: category The category is an appropriate choice from the list at [hub.steampipe.io/plugins](https://hub.steampipe.io/plugins). Front matter: icon_url The icon URL is a link to an `.svg` file hosted on hub.steampipe.io. Please request an icon through the [Turbot Community Slack](https://turbot.com/community/join) and a URL will be provided to use in this variable. Front matter: brand color The color matches the provider's brand guidelines, typically stated on a page like [this one](https://www.twilio.com/brand/elements/colorresources) for Twilio. Plugin description The description in `docs/index.md` is appropriate for the provider. The [AWS plugin](https://hub.steampipe.io/plugins/turbot/aws), for example, uses: > AWS provides on-demand cloud computing platforms and APIs to authenticated customers on a metered pay-as-you-go basis. The opening sentence of the Wikipedia page for the provider can be a good source of guidance here. Credentials Credentials are the most important piece of documentation. 
The plugin: - Explains scopes and required permissions - Links to provider documentation - Explains how to use existing CLI creds when that's possible Aggregator examples For plugins that benefit from using multiple connections and aggregators, like the AWS plugin, one or more H2 sections with examples should be added to the index document so users can easily reference it. For instance, the AWS plugin has examples in [AWS Multi-Account Connections](https://hub.steampipe.io/plugins/turbot/aws#multi-account-connections). If a plugin doesn't strongly benefit from aggregator connections, an H2 section called `Multiple Connections` should be added that briefly talks about aggregators and has a link to [Using Aggregators](https://steampipe.io/docs/managing/connections#using-aggregators). ### Table Documentation Useful examples Each table document shows 4-5 useful examples that reflect real-world scenarios. Please see [Writing Example Queries](https://steampipe.io/docs/develop/writing-example-queries) for common patterns and samples. Column specificity Most examples specify columns. Using `SELECT *` is OK for one or two things, but generally not preferred as it can produce too much data to be helpful. See also [When Not to SELECT *](https://steampipe.io/blog/selective-select). Required columns If some columns are required, these are called out and explained. ## Final Review Testing The plugin has been tested on a real account with substantial data. Please note that errors and API throttling issues may not appear when using a test account with little data. Matching query examples The example in `README.md` matches the one in `docs/index.md`. Matching config examples The example in `config/PLUGIN.spc` matches the one in `docs/index.md#configuration`. Social graphic The social graphic is included at the top of the README file and is uploaded to the Social preview feature in the GitHub repository. Please request a social graphic through the [Steampipe Slack](https://steampipe.io/community/join). Ease of first use The plugin really nails easy setup, there's a short path to a first successful query, and it runs quickly. Pre-mortem You've considered, and addressed, reasons why this plugin could fail to delight its community. --- --- title: Table & Column Standards sidebar_label: Table & Column Standards --- # Steampipe Table & Column Standards ## Naming - Use snake_case for all table and column names. - Table names are in the format `{plugin}_{service}_{resource_type}`. Generally, table names should match the corresponding Terraform resource name. - Use singular form (not plural) for table names, e.g. `aws_s3_bucket`, not `aws_s3_buckets`. - For columns derived from nested object fields, the column should contain the path, snake cased. For example `Foo.Bar.Baz` will be in a column named `foo_bar_baz`: ```json "foo": { "bar": { "baz": "value" } } ``` - Use Terraform as a strong inspiration for field names, when to expand arrays, etc. Being consistent with Terraform is both desirable and the minimum expectation. [Standard columns](#standard-columns) are an exception and should be consistent in our tables regardless of the Terraform name (they will very rarely conflict anyway) - When naming columns for which there is no direct equivalent: - Where the field contains an arn or arns, explicitly suffix with `_arn`: - Good: `attached_policy_arns` - Bad: `attached_policies` - Where the field contains an id, explicitly suffix with `_id`: - Good: `aws_account_id` - Bad: `aws_account` - Where the field contains a name but references something that may also have an id or arn, explicitly suffix with `_name`: - Good: `role_name` - Bad: `role` ## Standard Columns ALL tables that represent a resource should contain the following standard columns: | Column Name | Data Type | Description |-|-|- | `title` | `ColumnType_STRING` | The display name for this resource. | `akas` | `ColumnType_JSON` | A JSON array of AKAs (also-known-as) that uniquely identify this resource. The format of the AKAs varies by plugin (ARNs in AWS, resource paths for Azure) but they must be unique and should be immutable. | `tags` | `ColumnType_JSON` | The tags on this resource, **as a map of `key:value` pairs**. Many resources support tags, though not all in the same format. If the provider tags are in a different format, expose them in the native format in a `tags_raw` column, and convert them to a `key:value` map in the `tags` column. When tags are simple labels with no key:value (like GitHub issue labels), use the format `label:true`. You may choose to define additional standard columns that are specific to your plugin as well, and it is recommended to do so when appropriate. For example, we define standard columns for our cloud provider plugins: - AWS - `partition` - `account_id` - `region` - Azure - `subscription_id` - `resource_group` - `region` - Google - `project` - `location` ## Data Types Use the appropriate data type so that you can search and filter intelligently. Most of this is fairly self-explanatory but there are a couple items worth pointing out: - Steampipe does not support native Postgres arrays - use `ColumnType_JSON` for arrays - There are 2 valid IP address formats, `ColumnType_IPADDR` and `ColumnType_CIDR`, which correspond to the Postgres inet and cidr data types: - Use `ColumnType_IPADDR` for a single IP address - `10.11.12.13`. - Use `ColumnType_IPADDR` when a field can be either a single IP address OR a CIDR range - `192.168.0.0/24`, `10.11.12.13`. - Use `ColumnType_CIDR` for CIDR ranges that are ALWAYS represented as a CIDR - `192.168.0.0/24`, `10.11.12.13/32`. - The essential difference between the `ColumnType_IPADDR` and `ColumnType_CIDR` data types is that `ColumnType_IPADDR` accepts values with nonzero bits to the right of the netmask, whereas `ColumnType_CIDR` does not. For example, `192.168.0.1` is valid for `ColumnType_IPADDR` but not for `ColumnType_CIDR`. ## Table and Column Descriptions - While technically optional, all tables and columns should contain a `Description`. This is added as a comment in the Postgres schema and will be used: - To show more info within the CLI in the `.inspect` command. - To generate help/reference documentation on hub.steampipe.io - The descriptions should be pretty brief (1-2 sentences), and generally should be taken from the provider's API docs. - The descriptions should start with a capital letter, and end with a period. ## Column Defaults and null In general, use `null` when a field isn't present instead of setting a default.
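As a minimal sketch (the column name and Go field here are illustrative, not taken from a specific plugin), the SDK's `NullIfZero` transform is one way to surface null instead of a fabricated default:

```go
// Hypothetical column: NullIfZero maps the Go zero value (an empty string here)
// to SQL null, rather than storing a made-up default as if it were real data.
{
	Name:        "description",
	Type:        proto.ColumnType_STRING,
	Description: "The description of the resource.",
	Transform:   transform.FromField("Description").NullIfZero(),
},
```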
## Standardized Structure - Arrays should be stored in their native format as jsonb. - Fields containing an array of deep and important information (e.g. security group rules) **may** be expanded into a separate table. For example, `aws_vpc_security_groups` has an associated table of `aws_vpc_security_group_rules`. Use this model when the data is both important to query and large in scale. - Cloud providers sometimes store data in an array, even if they only ever have one value (e.g. AWS Subnet IPv6 CIDR Associations). In this case, you may choose to expand to columns as if there was a single object. - The original field (e.g. foo) should NOT be used, and should NOT have the full JSON array. Instead, we exclude the array data (it's noisy), but leave the field name available in case the provider actually uses an array in the future. - Generally, nested object fields like Foo.Bar.Baz are stored as foo_bar_baz - see [Naming](#naming) - JSON objects should be stored as `ColumnType_JSON` (jsonb), not a delimited string. If the JSON contains sub-objects that are json as string, convert to json (for example inline policies in AWS roles). - For JSON/YAML objects fields, if the raw format is also useful in itself (for example, the `template_body` in `aws_cloudformation_stack`), you may choose to create 2 columns: - `fieldname_src`: The string representation as `ColumnType_STRING`. - `fieldname`: The object representation as `ColumnType_JSON` (for joining, querying, etc). - Some JSON/YAML fields may allow multiple schema formats to represent the same object. For example, AWS IAM policies allow you to specify an array of `Action`s, or a single `Action` as a string, and are not case sensitive. In such a case, it is often useful to convert all of these objects to the same format to simplify searching and filtering. In such a case, you should keep the original object format in the `fieldname` column, and add an additional `fieldname_std` column in the standardized format. - Some fields are base64 encoded in the cloud provider's API. These can be evaluated on a case-by-case basis, but generally they should be decoded - If someone wants the column, they more than likely want to view or search the decoded text. - Key columns should appear first, then the rest added alphabetically, then "standard" columns last. Note that help (`.inspect`, online docs) order the columns alphabetically regardless of the order in the `create table` statement. --- --- title: Table Documentation Standards sidebar_label: Table Documentation Standards --- # Table Documentation Standards Creating table documentation is an important part of developing tables, as each document provides basic table information and example queries that appear on the Steampipe Hub. These example queries are especially important, as they are often the first thing a user will run to explore and understand a new table. Every table should have a Markdown document with a filename derived from the table name, e.g., `docs/tables/aws_acm_certificate.md` for `table_aws_acm_certificate.go`. Each document should include: - A header with the table name, e.g., `# Table: aws_s3_bucket` - A basic description - An `## Examples` section with multiple example queries For example, here's a table document for the `aws_s3_bucket` table: ````markdown # Table: aws_s3_bucket An Amazon S3 bucket is a public cloud storage resource available in Amazon Web Services' (AWS) Simple Storage Service (S3), an object storage offering. 
## Examples ### Basic info ```sql select name, region, account_id, bucket_policy_is_public from aws_s3_bucket; ``` ### List buckets with versioning disabled ```sql select name, region, account_id, versioning_enabled from aws_s3_bucket where not versioning_enabled; ``` ### List buckets with default encryption disabled ```sql select name, server_side_encryption_configuration from aws_s3_bucket where server_side_encryption_configuration is null; ``` ```` When the plugin is packaged and deployed to the Steampipe Registry, this Markdown file will be included in the plugin documentation on the Steampipe Hub. ## Description Guidelines ### Style Conventions - Descriptions should be short (1-3 sentences) and provide basic information on the table and its resource type - All sentences in the description should have the first word capitalized and end with a period - Good - `An AWS IAM user is an entity that you create in AWS to represent the person or application that uses it to interact with.` - Bad: The first letter should be capitalized and the sentence should end with a period - `an AWS IAM user is an entity that you create in AWS to represent the person or application that uses it to interact with` - References to resource names should follow the provider's documentation on capitalization - Good - `An AWS S3 bucket is a public cloud storage resource in AWS.` - Bad: "Bucket" should not be capitalized - `An AWS S3 Bucket is a public cloud storage resource in AWS.` ## Example Query Guidelines ### Basic Info Example The first example in each document should be a basic info query. This example query should select commonly used columns from the table and should only contain the `select` and `from` keywords. Example: ````markdown ### Basic info ```sql select instance_id, instance_type, region from aws_ec2_instance; ``` ```` ### Additional Examples After the basic info query, there should be a few (but at least one) additional example queries that provide an interesting view of the table data. For more information on creating additional example queries, please see Writing Example Queries. 
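For instance, a hypothetical additional example for the `aws_s3_bucket` table shown above (illustrative only, reusing columns from the basic info query) might highlight a security-relevant slice of the data:

````markdown
### List buckets that allow public access through their bucket policy

```sql
select
  name,
  region,
  bucket_policy_is_public
from
  aws_s3_bucket
where
  bucket_policy_is_public;
```
````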
### Style Conventions - Use H3 (`###`) for example query descriptions, e.g., `### List my resources` - Use SQL Formatter to format all SQL queries - Indentation level should be set to `2 spaces per indent level` - SQL keywords and identifiers should be set to `Modify to lower case` - Please test your queries after formatting them in case unexpected changes are made - Example descriptions should be in the imperative mood, e.g., `List buckets that are...`, `Count the number of instances...` - Good - `### List unencrypted databases` - `### List users named foo` - Bad: Should use "List" instead of "Listing" - `### Listing unencrypted databases` - Bad: Should not include "of" - `### List of users named foo` - Example descriptions should use the plural form of the resource name if the query can return more than 1 row - Good - `### List unencrypted instances` - `### Get the instance with a specific resource ID` - Bad: "instance" should be the plural form "instances" - `### List unencrypted instance` - Bad: "instances" should be singular since only one row can be returned - `### Get the instances with a specific resource ID` - Example descriptions should follow the provider's documentation on capitalization for resource and property names - Good - `### List instances with termination protection disabled` - Bad: "Instances" should not be capitalized - `### List Instances with termination protection disabled` - Do not include the service name in descriptions unless its required to differentiate similarly named tables - Good - `### List buckets in us-east-1` - Bad: Should not include "S3" - `### List S3 buckets in us-east-1` --- --- title: Using AI for Plugin Development sidebar_label: Using AI for Plugin Development --- # Using AI for Plugin Development Creating new tables for Steampipe plugins with AI tools and IDEs works remarkably well. At Turbot, we develop plugin tables frequently and use AI for almost every new table we create. We've experimented with various approaches, including detailed prompt engineering, explicit guidelines, IDE rules and instructions, and complex workflows, but found that AI typically produces excellent results even without heavy guidance. The key to this success is working within existing plugin repositories and opening the entire repository as a folder or project in your IDE. This gives AI tools access to existing table implementations, documentation examples, code patterns, and naming conventions to generate consistent, high-quality results without extensive prompting. If you're looking to use AI to query Steampipe rather than develop new tables, you can use the [Steampipe MCP server](../query/mcp), which provides powerful tools for AI agents to inspect tables and run queries. ## Getting Started While AI often works well with simple requests like "Create a table for [resource_type]", here are some prompts we use at Turbot that you may find helpful as starting points. ### Prerequisites 1. Open the plugin repository in your IDE (Cursor, VS Code, Windsurf, etc.) to give AI tools access to all existing code and documentation. 2. Ensure you have Steampipe installed with a connection configured for the plugin. 3. Set up access to create test resources in the provider. 4. Configure the [Steampipe MCP server](https://github.com/turbot/steampipe-mcp) which allows the agent to inspect tables and run queries. ### Create Table First, create the new table and its documentation, using existing tables and docs as reference. 
#### Prompt ```md Your goal is to create a new Steampipe table and documentation for {resource_type}. 1. Review existing tables and their documentation in the plugin to understand the established patterns, naming conventions, and column structures. 2. Use `go doc` commands to understand the SDK's API structure for the resource type. 3. Create the table implementation with appropriate List/Get functions and any additional hydrate functions needed for extra API calls. Avoid hydrate functions that require paging as these belong in separate tables. 4. Register the new table in plugin.go in alphabetical order. 5. Create documentation at `docs/tables/{table_name}.md`. - For Postgres queries, use `->` and `->>` operators with spaces before and after instead of `json_extract` functions. - Include resource identifiers in non-aggregate queries. ``` ### Build Plugin Next, build the plugin with your changes and verify your new table is properly registered. #### Prompt ```md Your goal is to build the plugin using the exact commands below and verify that your new table is properly registered and functional. 1. Build the plugin using `make dev` if available, otherwise use `make`. 2. Check the Steampipe service status with `steampipe service status`. Start it with `steampipe service start` if not running, or restart it with `steampipe service restart` if already running. 3. Test if the Steampipe MCP server is available by running the `steampipe_table_list` tool. 4. If the MCP server is available, use it to verify the table exists in the schema and can be queried successfully. 5. If the MCP server is not available, verify table registration manually with `steampipe query "select column_name, data_type from information_schema.columns where table_schema = '{plugin_schema}' and table_name = '{table_name}' order by ordinal_position"`, then test basic querying with `steampipe query "select * from {table_name}"`. ``` ### Create Test Resources To test the table's functionality, you'll need resources to query. You can either use existing resources or create new test resources with appropriate properties. #### Prompt ```md Your goal is to create test resources for {resource_type} to validate your Steampipe table implementation. 1. Create test resources with as many properties set as possible. - Use the provider's CLI if available, Terraform configuration if CLI isn't available, or API calls via shell script as a last resort. - Create any dependent resources needed. - Use the most cost-effective configuration. If the estimated cost is high, e.g., $50, warn about the expense rather than proceeding. 2. Verify that all resources were created successfully using the same tool or method used for creation. ``` ### Validate Column Data Next, query the table to test that columns and data types are correctly implemented. #### Prompt ```md Your goal is to thoroughly test your table implementation by validating column data and executing documentation examples. Use the Steampipe MCP server for running test queries if available, otherwise use the `steampipe` CLI commands directly. 1. Execute `select * from {table_name}` to validate that all columns return expected data based on the actual resource properties and have correct data types. 2. Test each example query from the table documentation to verify the SQL syntax is correct, queries execute without errors, and results match the example descriptions. 3. Share all test results in raw Markdown format to make them easy to export and review. ``` ### Cleanup Test Resources After testing is completed, remove any resources created for testing.
#### Prompt ```md Your goal is to clean up all test resources created for validation to avoid ongoing costs. 1. Delete all resources created for testing, including any dependent resources, using the same method that was used to create them. 2. Verify that all resources were successfully deleted, using the same method that was used to delete them. ``` --- --- title: Writing Example Queries sidebar_label: Writing Example Queries --- # Writing Example Queries To help you get started on creating useful example queries, we've compiled a list of potential topics that can be used as guidelines and includes some basic and advanced examples of these from existing tables. Please note though that the topics below are just suggestions and example queries are not limited to just these topics. - Security - Access policies - Credential expiration and rotation - Encryption - Versioning - Operations - Audit logging - Data retention and backups - Tagging - Cost management - Capacity optimization - Underutilized resources ## Basic Examples ### aws_s3_bucket ````markdown ### Basic info ```sql select name, region from aws_s3_bucket; ``` ### List buckets which do not have default encryption enabled ```sql select name, server_side_encryption_configuration from aws_s3_bucket where server_side_encryption_configuration is null; ``` ### List buckets that are missing required tags ```sql select name, tags from aws_s3_bucket where tags -> 'owner' is null or tags -> 'app_id' is null; ``` ```` ### aws_ebs_volume ````markdown ### Basic info ```sql select volume_id, volume_type, encrypted, region from aws_ebs_volume; ``` ### List unencrypted volumes ```sql select volume_id, encrypted from aws_ebs_volume where not encrypted; ``` ### Count the number of volumes by volume type ```sql select volume_type, count(volume_type) as count from aws_ebs_volume group by volume_type; ``` ### List unattached volumes ```sql select volume_id, volume_type from aws_ebs_volume where attachments is null; ``` ```` ## Advanced Examples ### Joining information from the AWS EC2 instance and volume tables ````markdown ### List unencrypted volumes attached to each instance ```sql select i.instance_id, vols -> 'Ebs' ->> 'VolumeId' as vol_id, vol.encrypted from aws_ec2_instance as i cross join jsonb_array_elements(block_device_mappings) as vols join aws_ebs_volume as vol on vol.volume_id = vols -> 'Ebs' ->> 'VolumeId' where not vol.encrypted; ``` ```` ### Joining information from the Azure Compute virtual machine and network security group tables ````markdown ### Get network security group rules for all security groups attached to a virtual machine ```sql select vm.name, nsg.name, jsonb_pretty(security_rules) from azure.azure_compute_virtual_machine as vm, jsonb_array_elements(vm.network_interfaces) as vm_nic, azure_network_security_group as nsg, jsonb_array_elements(nsg.network_interfaces) as nsg_int where lower(vm_nic ->> 'id') = lower(nsg_int ->> 'id') and vm.name = 'warehouse-01'; ``` ```` ### Querying complex jsonb columns for AWS S3 buckets ````markdown ### List buckets that enforce encryption in transit ```sql select name, p as principal, a as action, s ->> 'Effect' as effect, s ->> 'Condition' as conditions, ssl from aws_s3_bucket, jsonb_array_elements(policy_std -> 'Statement') as s, jsonb_array_elements_text(s -> 'Principal' -> 'AWS') as p, jsonb_array_elements_text(s -> 'Action') as a, jsonb_array_elements_text( s -> 'Condition' -> 'Bool' -> 'aws:securetransport' ) as ssl where p = '*' and s ->> 'Effect' = 'Deny' and ssl :: bool = false; 
``` ```` --- --- title: Writing Plugins sidebar_label: Writing Plugins --- # Writing Plugins The Steampipe Plugin SDK makes writing tables fast, easy, and fun! Most of the heavy lifting is taken care of for you — just define your tables and columns, wire up a few API calls, and you can start to query your service with standard SQL! While this document will provide an introduction and some examples, note that Steampipe is an evolving, open source project - refer to the code as the authoritative source, as well as for real-world examples. Also, please try to be a good community citizen — following the standards makes for a better, more consistent experience for end-users and developers alike. Let's get started! - [The Basics](#the-basics) - [Implementing Tables](#implementing-tables) - [Hydrate Functions](#hydrate-functions) - [Client-Side Rate Limiting](#client-side-rate-limiting) - [Function Tags](#function-tags) - [Accounting for Paged List calls](#accounting-for-paged-list-calls) - [Logging](#logging) - [Installing and Testing Your Plugin](#installing-and-testing-your-plugin) ---- ## The Basics ### main.go The `main` function in then `main.go` is the entry point for your plugin. This function must call `plugin.Serve` from the plugin sdk to instantiate your plugin gRPC server. You will pass the plugin function that you will create in the [plugin.go](#plugingo) file: ### Example: main.go ```go package main import ( "github.com/turbot/steampipe-plugin-sdk/v5/plugin" "github.com/turbot/steampipe-plugin-zendesk/zendesk" ) func main() { plugin.Serve(&plugin.ServeOpts{PluginFunc: zendesk.Plugin}) } ``` ### plugin.go The `plugin.go` file should implement a single [Plugin Definition](#plugin-definition) (`Plugin()` function) that returns a pointer to a `Plugin` to be loaded by the gRPC server. By convention, the package name for your plugin should be the same name as your plugin, and go files for your plugin (except `main.go`) should reside in a folder with the same name. ### Example: plugin.go ```go package zendesk import ( "context" "github.com/turbot/steampipe-plugin-sdk/v5/plugin" "github.com/turbot/steampipe-plugin-sdk/v5/plugin/transform" ) func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-zendesk", DefaultTransform: transform.FromGo().NullIfZero(), TableMap: map[string]*plugin.Table{ "zendesk_brand": tableZendeskBrand(), "zendesk_group": tableZendeskGroup(), "zendesk_organization": tableZendeskOrganization(), "zendesk_search": tableZendeskSearch(), "zendesk_ticket": tableZendeskTicket(), "zendesk_ticket_audit": tableZendeskTicketAudit(), "zendesk_trigger": tableZendeskTrigger(), "zendesk_user": tableZendeskUser(), }, } return p } ``` ### Plugin Definition | Argument | Description |-|- | `Name` | The name of the plugin (`steampipe-plugin-{plugin name}`). | `TableMap` | A map of table names to [Table definitions](#implementing-tables). | `DefaultTransform` | A default [Transform Function](#transform-functions) to be used when one is not specified. While not required, this may save quite a bit of repeated code. | `DefaultGetConfig` | Provides an optional mechanism for providing plugin-level defaults to a get config. This is merged with the GetConfig defined in the table and/or columns. Typically, this is used to standardize error handling with `ShouldIgnoreError`. | `SchemaMode` | Specifies if the schema should be checked and re-imported if changed every time Steampipe starts. This can be set to `dynamic` or `static`. Defaults to `static`. 
| `RequiredColumns` | An optional list of columns that ALL tables in this plugin MUST implement. --- ### Table Definition The `plugin.Table` may specify: | Argument | Description |-|- | `Name` | The name of the table. | `Description` | A short description, added as a comment on the table and used in help commands and documentation. | `Columns` | An array of [column definitions](#column-definition). | `List` | A [List Config](#list-config) definition, used to fetch the data items used to build all rows of a table. | `Get` | A [Get Config](#get-config) definition, used to fetch a single item. | `DefaultTransform` | A default [transform function](#transform-functions) to be used when one is not specified. If set, this will override the default set in the plugin definition. | `HydrateDependencies` | Definitions of dependencies between hydrate functions (for cases where a hydrate function needs the results of another hydrate function). ### List Config A ListConfig definition defines how to list all rows of a table. | Argument | Description |-|- | `KeyColumns` | An optional list of columns that require a qualifier in order to list data for this table. | `Hydrate` | A [hydrate function](#hydrate-functions) which is called first when performing a 'list' call. | `ParentHydrate` | An optional parent list function - if you list items with a parent-child relationship, this will list the parent items. ### Get Config A GetConfig definition defines how to get a single row of a table. | Argument | Description |-|- | `KeyColumns` | A list of keys which are used to uniquely identify rows - used to determine whether a query is a 'get' call. | `ItemFromKey` [DEPRECATED] | This property is deprecated. | `Hydrate` | A [hydrate function](#hydrate-functions) which is called first when performing a 'get' call. If this returns 'not found', no further hydrate functions are called. | `ShouldIgnoreError` | A function which will return whether to ignore a given error. ### Column Definition A column definition specifies the name and description of the column, its data type, and the functions to call to hydrate the column (if the list call does not) and transform it (if the default transformation is not sufficient). | Argument | Description |-|- | `Name` | The column name. | `Type` | The [data type](#column-data-types) for this column. | `Description` | The column description, added as a comment and used in help commands and documentation. | `Hydrate` | You can explicitly specify the [hydrate function](#hydrate-functions) to populate this column. This is only needed if neither the default hydrate functions nor the `List` function return data for this column. | `Default` | An optional default column value. | `Transform` | An optional chain of [transform functions](#transform-functions) to generate the column value.
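Putting those arguments together, here is a rough sketch (the column names and the `getResourceDetail` hydrate function are hypothetical) of entries in a table's `Columns` array:

```go
// Hypothetical sketch of a Columns array. getResourceDetail is assumed to be a
// hydrate function in this plugin that makes an extra API call for the table.
Columns: []*plugin.Column{
	// Populated directly from the item returned by the List/Get call.
	{Name: "name", Type: proto.ColumnType_STRING, Description: "The name of the resource."},
	// Populated by its own hydrate function, with a transform to pick the field.
	{
		Name:        "policy",
		Type:        proto.ColumnType_JSON,
		Description: "The policy document attached to the resource.",
		Hydrate:     getResourceDetail,
		Transform:   transform.FromField("Policy"),
	},
},
```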
### Column Data Types Currently supported data types are: | Name | Type |-|- | `ColumnType_BOOL` | Boolean | `ColumnType_INT` | Integer | `ColumnType_DOUBLE` | Double precision floating point | `ColumnType_STRING` | String | `ColumnType_JSON` | JSON | `ColumnType_DATETIME` | Date/Time (Deprecated - use ColumnType_TIMESTAMP) | `ColumnType_TIMESTAMP` | Date/Time | `ColumnType_IPADDR` | IP Address | `ColumnType_CIDR` | IP network CIDR | `ColumnType_UNKNOWN` | Unknown | `ColumnType_INET` | Either an IP Address or an IP network CIDR | `ColumnType_LTREE` | [Ltree](https://www.postgresql.org/docs/current/ltree.html) --- ## Implementing Tables By convention, each table should be implemented in a separate file named `table_{table name}.go`. Each table will have a single table definition function that returns a pointer to a `plugin.Table` (this is the function specified in the `TableMap` of the [plugin definition](#plugin-definition)). The function name is typically the table name in camel case (per golang standards) prefixed by `table`. The table definition specifies the name and description of the table, a list of column definitions, and the functions to call in order to list the data for all the rows, or to get data for a single row. When a connection is created, Steampipe uses the table and column definitions to create the Postgres foreign tables, however the tables don't store the data — the data is populated (hydrated) when a query is run. The basic flow is: 1. A user runs a steampipe query against the database 1. Postgres parses the query and sends the parsed request to the Steampipe FDW. 1. The Steampipe Foreign Data Wrapper (Steampipe FDW) determines what tables and columns are required. 1. The FDW calls the appropriate [Hydrate Functions](#hydrate-functions) in the plugin, which fetch the appropriate data from the API, cloud provider, etc. - Each table defines two special hydrate functions, `List` and `Get`. The `List` or `Get` will always be called before any other hydrate function in the table, as the other functions typically depend on the result of the Get or List call. - Whether `List` or `Get` is called depends upon whether the qualifiers (in `where` clauses and `join...on`) match the `KeyColumns`. This allows Steampipe to fetch only the "row" data that it needs. Qualifiers (aka quals) enable Steampipe to map a Postgres constraint (e.g. `where created_at > date('2023-01-01')`) to the API parameter (e.g. `since=1673992596000`) that the plugin's supporting SDK uses to fetch results matching the Postgres constraint. See [How To enhance a plugin with a new table that supports 'quals'](https://steampipe.io/blog/vercel-table) for a complete example. - Multiple columns may (and usually do) get built from the same hydrate function, but steampipe only calls the hydrate functions for the columns requested (specified in the `select`, `join`, or `where`). This allows steampipe to call only those APIs for the "column" data requested in the query. 1. The [Transform Functions](#transform-functions) are called for each column. The transform functions extract and/or reformat data returned by the hydrate functions into the format to be returned in the column. 1. The plugin returns the transformed data to the Steampipe FDW 1. Steampipe FDW returns the results to the database ## Hydrate Functions A hydrate function connects to an external system or service and gathers data to fill a database table. 
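Every hydrate function shares the same signature. As a sketch (the `connect` helper, the API client, and the `api.Resource` type are hypothetical), a column-level hydrate function typically looks something like this:

```go
// getResourceDetail is a hypothetical column-level hydrate function: it reads the
// row item produced by the List/Get call, makes one extra API call, and returns
// the detail object for the SDK to map onto columns.
func getResourceDetail(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) {
	resource := h.Item.(*api.Resource) // item streamed by the List function (hypothetical type)

	// connect is a hypothetical plugin helper that builds and caches an API client.
	client, err := connect(ctx, d)
	if err != nil {
		return nil, err
	}

	// One extra API call, invoked only when a column that needs this data is requested.
	detail, err := client.GetResourceDetail(ctx, resource.ID) // hypothetical API call
	if err != nil {
		plugin.Logger(ctx).Error("getResourceDetail", "api_error", err)
		return nil, err
	}
	return detail, nil
}
```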
`Get` and `List` are hydrate functions, defined in the [Table Definition](#table-definition), that have these special characteristics: - Every table ***must*** define a `List` and/or `Get` function. - The `List` or `Get` will always be called before any other hydrate function in the table, as the other functions typically depend on the result of the `Get` or `List` call. - Whether `List` or `Get` is called depends upon whether the qualifiers (in `where` clauses and `join...on`) match the `KeyColumns` defined in the [Get Config](#get-config). This enables Steampipe to fetch only the "row" data that it needs. - Typically, hydrate functions return a single data item (data for a single row). *List functions are an exception* — they stream data for multiple rows using the [QueryData](https://github.com/turbot/steampipe-plugin-sdk/blob/HEAD/plugin/query_data.go) object, and return `nil`. - The `Get` function will usually get the key column data from the `QueryData.KeyColumnQuals` so that it can get the appropriate item as based on the qualifiers (`where` clause, `join...on`). If the `Get` hydrate function is used as both a `Get` function AND a normal hydrate function, you should get the key column data from the `HydrateData.Item` if it is not nil, and use the `QueryData.KeyColumnQuals` otherwise. ### About List Functions A `List` function retrieves all the items of a particular resource type from an API. For example, the [github_my_gist](https://hub.steampipe.io/plugins/turbot/github/tables/github_my_gist) table supports the query: ```sql select * from github_my_gist ``` The function `tableGitHubMyGist` [defines the table](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_my_gist.go#L10-L19) like so. ```go func tableGitHubMyGist() *plugin.Table { return &plugin.Table{ Name: "github_my_gist", Description: "GitHub Gists owned by you. GitHub Gist is a simple way to share snippets and pastes with others.", List: &plugin.ListConfig{ Hydrate: tableGitHubMyGistList, }, Columns: gitHubGistColumns(), } } ``` The table's `List` property refers, by way of the `Hydrate` property, to a Steampipe function that lists gists, [tableGitHubMyGistList](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_my_gist.go#L21-L58). That function calls the GitHub Go SDK's [GistsService.List](https://pkg.go.dev/github.com/google/go-github/v60/github#GistsService.List) and returns an array of pointers to items of type `Gist` as [defined](https://pkg.go.dev/github.com/google/go-github/v60/github#Gist) in the Go SDK. 
```go type Gist struct { ID *string `json:"id,omitempty"` Description *string `json:"description,omitempty"` Public *bool `json:"public,omitempty"` Owner *User `json:"owner,omitempty"` Files map[GistFilename]GistFile `json:"files,omitempty"` Comments *int `json:"comments,omitempty"` HTMLURL *string `json:"html_url,omitempty"` GitPullURL *string `json:"git_pull_url,omitempty"` GitPushURL *string `json:"git_push_url,omitempty"` CreatedAt *Timestamp `json:"created_at,omitempty"` UpdatedAt *Timestamp `json:"updated_at,omitempty"` NodeID *string `json:"node_id,omitempty"` } ``` The `Columns` property in `tableGitHubMyGist` refers to the function `gitHubGistColumns`, which is shared with a related table [table_github_gist](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_gist.go#L13-L32), and which maps that Go schema to this database schema. ```go func gitHubGistColumns() []*plugin.Column { return []*plugin.Column{ // Top columns {Name: "id", Type: proto.ColumnType_STRING, Description: "The unique id of the gist."}, {Name: "description", Type: proto.ColumnType_STRING, Description: "The gist description."}, {Name: "public", Type: proto.ColumnType_BOOL, Description: "If true, the gist is public, otherwise it is private."}, {Name: "html_url", Type: proto.ColumnType_STRING, Description: "The HTML URL of the gist."}, {Name: "comments", Type: proto.ColumnType_INT, Description: "The number of comments for the gist."}, {Name: "created_at", Type: proto.ColumnType_TIMESTAMP, Transform: transform.FromField("CreatedAt").Transform(convertTimestamp), Description: "The timestamp when the gist was created."}, {Name: "git_pull_url", Type: proto.ColumnType_STRING, Description: "The https url to pull or clone the gist."}, {Name: "git_push_url", Type: proto.ColumnType_STRING, Description: "The https url to push the gist."}, {Name: "node_id", Type: proto.ColumnType_STRING, Description: "The Node ID of the gist."}, // Only load relevant fields from the owner {Name: "owner_id", Type: proto.ColumnType_INT, Description: "The user id (number) of the gist owner.", Transform: transform.FromField("Owner.ID")}, {Name: "owner_login", Type: proto.ColumnType_STRING, Description: "The user login name of the gist owner.", Transform: transform.FromField("Owner.Login")}, {Name: "owner_type", Type: proto.ColumnType_STRING, Description: "The type of the gist owner (User or Organization).", Transform: transform.FromField("Owner.Type")}, {Name: "updated_at", Type: proto.ColumnType_TIMESTAMP, Transform: transform.FromField("UpdatedAt").Transform(convertTimestamp), Description: "The timestamp when the gist was last updated."}, {Name: "files", Type: proto.ColumnType_JSON, Transform: transform.FromField("Files").Transform(gistFileMapToArray), Description: "Files in the gist."}, } } ``` Here's the [tableGitHubMyGistList](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_my_gist.go#L21-#L58) function. 
```go func tableGitHubMyGistList(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) { client := connect(ctx, d) opt := &github.GistListOptions{ListOptions: github.ListOptions{PerPage: 100}} limit := d.QueryContext.Limit // the SQL LIMIT if limit != nil { if *limit < int64(opt.ListOptions.PerPage) { opt.ListOptions.PerPage = int(*limit) } } for { gists, resp, err := client.Gists.List(ctx, "", opt) // call https://pkg.go.dev/github.com/google/go-github/v60/github#GistsService.List if err != nil { return nil, err } for _, i := range gists { if i != nil { d.StreamListItem(ctx, i) // send the item to steampipe } // Context can be cancelled due to manual cancellation or the limit has been hit if d.RowsRemaining(ctx) == 0 { return nil, nil } } if resp.NextPage == 0 { break } opt.Page = resp.NextPage } return nil, nil } ``` A Steampipe `List` function is one of two special forms of [hydrate function](/docs/develop/writing-plugins#hydrate-functions) — `Get` is the other — that take precedence over other [hydrate functions](https://pkg.go.dev/github.com/turbot/steampipe-plugin-sdk/v5/plugin#HydrateFunc) which are declared using the `HydrateConfig` property of a table definition. ### About Get Functions A `Get` function fetches a single item by its key. While it's possible to define a table that only uses `Get`, the common pattern combines `List` to retrieve basic data and `Get` to enrich it. For example, here's the definition of the table [github_gitignore](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_gitignore.go#L12-L30). ```go func tableGitHubGitignore() *plugin.Table { return &plugin.Table{ Name: "github_gitignore", Description: "GitHub defined .gitignore templates that you can associate with your repository.", List: &plugin.ListConfig{ Hydrate: tableGitHubGitignoreList, }, Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("name"), ShouldIgnoreError: isNotFoundError([]string{"404"}), Hydrate: tableGitHubGitignoreGetData, }, Columns: []*plugin.Column{ // Top columns {Name: "name", Type: proto.ColumnType_STRING, Description: "Name of the gitignore template."}, {Name: "source", Type: proto.ColumnType_STRING, Hydrate: tableGitHubGitignoreGetData, Description: "Source code of the gitignore template."}, }, } } ```` In this case, the `source` column data is not included in the API response from the `tableGitHubGitignoreList` function. So the `tableGitHubGitignoreGetData` function is specified as the `Hydrate` function for that column. The `List` function, [tableGitHubGitignoreList](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_gitignore.go#L32-L51), calls the SDK's [GitignoresService.List](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#GitignoresService.List) which returns an array of strings which are the names of [.gitignore templates](https://docs.github.com/en/rest/gitignore?apiVersion=2022-11-28#listing-available-templates). 
The `Get` function, [tableGitHubGitignoreGetData](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_gitignore.go#L53-L75), receives the name of a template and calls the SDK's [GitignoresService.Get](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#GitignoresService.Get) to return an item of type [GitIgnore](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#Gitignore), which corresponds to the `Columns` in the table definition. ```go type Gitignore struct { Name *string `json:"name,omitempty"` Source *string `json:"source,omitempty"` } ``` For example, here's the result for `select * from github_gitignore where name = 'go'`. ``` +------+-------------------------------------------------------------------------------------------+ | name | source | +------+-------------------------------------------------------------------------------------------+ | Go | # If you prefer the allow list template instead of the deny list, see community template: | | | # https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore | | | # | | | # Binaries for programs and plugins | | | *.exe | | | *.exe~ | | | *.dll | | | *.so | | | *.dylib | +------+-------------------------------------------------------------------------------------------+ ``` The `List` function finds all the names of the templates provided by GitHub, and the `Get` function adds the `source` column. #### When the column definition doesn't need to specify a `Hydrate` When the underlying SDK functions for a `List` and `Get` both return complete information, the column definition doesn't need to specify a `Hydrate`. For example, here's the definition for [table_github_actions_artifact](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_actions_artifact.go#L13-L42). 
```go func tableGitHubActionsArtifact() *plugin.Table { return &plugin.Table{ Name: "github_actions_artifact", Description: "Artifacts allow you to share data between jobs in a workflow and store data once that workflow has completed.", List: &plugin.ListConfig{ KeyColumns: plugin.SingleColumn("repository_full_name"), ShouldIgnoreError: isNotFoundError([]string{"404"}), Hydrate: tableGitHubArtifactList, }, Get: &plugin.GetConfig{ KeyColumns: plugin.AllColumns([]string{"repository_full_name", "id"}), ShouldIgnoreError: isNotFoundError([]string{"404"}), Hydrate: tableGitHubArtifactGet, }, Columns: []*plugin.Column{ // Top columns {Name: "repository_full_name", Type: proto.ColumnType_STRING, Transform: transform.FromQual("repository_full_name"), Description: "Full name of the repository that contains the artifact."}, {Name: "name", Type: proto.ColumnType_STRING, Description: "The name of the artifact."}, {Name: "id", Type: proto.ColumnType_INT, Description: "Unique ID of the artifact."}, {Name: "size_in_bytes", Type: proto.ColumnType_INT, Description: "Size of the artifact in bytes."}, // Other columns {Name: "archive_download_url", Type: proto.ColumnType_STRING, Transform: transform.FromField("ArchiveDownloadURL"), Description: "Archive download URL for the artifact."}, {Name: "created_at", Type: proto.ColumnType_TIMESTAMP, Transform: transform.FromField("CreatedAt").Transform(convertTimestamp), Description: "Time when the artifact was created."}, {Name: "expired", Type: proto.ColumnType_BOOL, Description: "It defines whether the artifact is expires or not."}, {Name: "expires_at", Type: proto.ColumnType_TIMESTAMP, Transform: transform.FromField("ExpiresAt").Transform(convertTimestamp), Description: "Time when the artifact expires."}, {Name: "node_id", Type: proto.ColumnType_STRING, Description: "Node where GitHub stores this data internally."}, }, } } ``` The SDK's [ListArtifacts](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#ActionsService.ListArtifacts) returns an array of [Artifact](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#Artifact) and its [GetArtifact](https://pkg.go.dev/github.com/google/go-github/v55@v55.0.0/github#ActionsService.GetArtifact) returns a single `Artifact` object. As with `tableGitHubGitignore`, these are separate APIs — wrapped by the Go SDK — to [list basic info](https://docs.github.com/en/rest/actions/artifacts?apiVersion=2022-11-28#list-artifacts-for-a-repository) and [get details](https://docs.github.com/en/rest/actions/artifacts?apiVersion=2022-11-28#get-an-artifact) artifacts. If the query's `where` or `join...on` specifies an `id`, the plugin will use the optimal `Get` function, otherwise the `List` function, to call the corresponding APIs. Either way, the same API response matches the schema declared in `Columns`. ```go type Artifact struct { ID *int64 `json:"id,omitempty"` NodeID *string `json:"node_id,omitempty"` Name *string `json:"name,omitempty"` SizeInBytes *int64 `json:"size_in_bytes,omitempty"` URL *string `json:"url,omitempty"` ArchiveDownloadURL *string `json:"archive_download_url,omitempty"` Expired *bool `json:"expired,omitempty"` CreatedAt *Timestamp `json:"created_at,omitempty"` UpdatedAt *Timestamp `json:"updated_at,omitempty"` ExpiresAt *Timestamp `json:"expires_at,omitempty"` WorkflowRun *ArtifactWorkflowRun `json:"workflow_run,omitempty"` } ``` #### When Steampipe calls `List` vs `Get` Which function is called when you query the `github_actions_artifact` table? It depends! 
We can use [diagnostic mode](https://steampipe.io/docs/guides/limiter#exploring--troubleshooting-with-diagnostic-mode) to explore. This query, which lists all the artifacts in a repo, uses the `List` function `tableGitHubArtifactList`. ``` STEAMPIPE_DIAGNOSTIC_LEVEL=all steampipe service start > select jsonb_pretty(_ctx) from github_actions_artifact where repository_full_name = 'turbot/steampipe-plugin-github' +----------------------------------------------------------------+ | jsonb_pretty | +----------------------------------------------------------------+ | { | | "steampipe": { | | "sdk_version": "5.8.0" | | }, | | "diagnostics": { | | "calls": [ | | { | | "type": "list", | | "scope_values": { | | "table": "github_actions_artifact", | | "connection": "github", | | "function_name": "tableGitHubArtifactList" | | }, | | "function_name": "tableGitHubArtifactList", | | "rate_limiters": [ | | ], | | "rate_limiter_delay_ms": 0 | | } | | ] | | }, | | "connection_name": "github" | | } | ``` This query, which uses the qualifier `id`, uses the `Get` function `tableGitHubArtifactGet`. ``` > select jsonb_pretty(_ctx) from github_actions_artifact where id = '1248325644' and repository_full_name = 'turbot/steampipe-plugin-github' +---------------------------------------------------------------+ | jsonb_pretty | +---------------------------------------------------------------+ | { | | "steampipe": { | | "sdk_version": "5.8.0" | | }, | | "diagnostics": { | | "calls": [ | | { | | "type": "get", | | "scope_values": { | | "table": "github_actions_artifact", | | "connection": "github", | | "function_name": "tableGitHubArtifactGet" | | }, | | "function_name": "tableGitHubArtifactGet", | | "rate_limiters": [ | | ], | | "rate_limiter_delay_ms": 0 | | } | | ] | | }, | | "connection_name": "github" | | } | +---------------------------------------------------------------+ ``` This works because `id` is one of the `KeyColumns` in the `Get` property of the table definition. That enables the [Steampipe plugin SDK](https://github.com/turbot/steampipe-plugin-sdk) to choose the more optimal `tableGitHubArtifactGet` function when the `id` is known and it isn't necessary to list all artifacts in order to retrieve just a single one. ### List or Get in Combination with Hydrate In addition to to the special `List` and `Get` hydrate functions, there's a class of general hydrate functions that enrich what's returned by `List` or `Get`. In `table_aws_cloudtrail_trail.go`, [getCloudTrailStatus](https://github.com/turbot/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_cloudtrail_trail.go#L329-L369) is an example of this kind of function. Steampipe knows it's a `HydrateFunc` because the table definition declares it in the [HydrateConfig](https://github.com/turbot/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_cloudtrail_trail.go#L42-L46) property of the table definition. ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getCloudtrailTrailStatus, Tags: map[string]string{"service": "cloudtrail", "action": "GetTrailStatus"}, }, ... }, ``` A `HydrateFunc` is typically used in combination with `List` or `Get`. 
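Conceptually, a `HydrateFunc` has the same signature as any other hydrate function and receives the row produced by `List` or `Get` through `h.Item`. Here's a minimal sketch, with hypothetical names (`Widget`, `connectWidgetAPI`, `GetWidgetDetails`) standing in for a real SDK:

```go
// getWidgetDetails enriches a row streamed by the table's List function (or
// returned by Get) with one extra per-row API call. Steampipe calls it only
// when a query requests a column that names it as its Hydrate function or
// declares it in HydrateConfig.
func getWidgetDetails(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) {
	widget := h.Item.(*Widget) // the parent row item produced by List or Get

	client, err := connectWidgetAPI(ctx, d) // hypothetical connection helper
	if err != nil {
		return nil, err
	}

	// The returned value becomes the source item for the columns that use this function.
	return client.GetWidgetDetails(ctx, *widget.ID)
}
```

The AWS example that follows shows this combination in a real plugin.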
For example, the `List` function for `table_aws_fms_app_list.go` uses the SDK's [NewListAppsListsPaginator](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/fms@v1.24.3#NewListAppsListsPaginator) to get [basic info](https://github.com/turbot/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_fms_app_list.go#L43-L60) declared in the `Columns` property of the table definition. ```go { Name: "list_name", Description: "The name of the applications list.", Type: proto.ColumnType_STRING, Transform: transform.FromField("ListName", "AppsList.ListName"), }, { Name: "list_id", Description: "The ID of the applications list.", Type: proto.ColumnType_STRING, Transform: transform.FromField("ListId", "AppsList.ListId"), }, { Name: "arn", Description: "The Amazon Resource Name (ARN) of the applications list.", Type: proto.ColumnType_STRING, Transform: transform.FromField("ListArn", "AppsListArn"), }, ``` These correspond to the type [AppsListDataSummary](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/fms@v1.24.3/types#AppsListDataSummary) in the AWS SDK. The `Columns` property also declares [four other columns](https://github.com/judell/steampipe-plugin-aws/blob/HEAD/aws/table_aws_fms_app_list.go#L61-L90) that use the `HydrateFunc` called [getFMSAppList](https://github.com/judell/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_fms_app_list.go#L164-L204). ```go { Name: "create_time", Description: "The time that the Firewall Manager applications list was created.", Type: proto.ColumnType_TIMESTAMP, Hydrate: getFmsAppList, }, { Name: "last_update_time", Description: "The time that the Firewall Manager applications list was last updated.", Type: proto.ColumnType_TIMESTAMP, Hydrate: getFmsAppList, }, { Name: "list_update_token", Description: "A unique identifier for each update to the list. When you update the list, the update token must match the token of the current version of the application list.", Type: proto.ColumnType_STRING, Hydrate: getFmsAppList, }, { Name: "previous_apps_list", Description: "A map of previous version numbers to their corresponding App object arrays.", Type: proto.ColumnType_JSON, Hydrate: getFmsAppList, }, { Name: "apps_list", Description: "An array of applications in the Firewall Manager applications list.", Type: proto.ColumnType_JSON, Hydrate: getFmsAppList, }, ``` Those columns correspond to fields of the type [AppsListData](https://github.com/aws/aws-sdk-go-v2/blob/8d9a27a085ae3d026a8fa910d30d7eb51221ab15/service/fms/types/types.go#L137-L167) in the AWS SDK. ```go type AppsListData struct { // An array of applications in the Firewall Manager applications list. // // This member is required. AppsList []App // The name of the Firewall Manager applications list. // // This member is required. ListName *string // The time that the Firewall Manager applications list was created. CreateTime *time.Time // The time that the Firewall Manager applications list was last updated. LastUpdateTime *time.Time // The ID of the Firewall Manager applications list. ListId *string // A unique identifier for each update to the list. When you update the list, the // update token must match the token of the current version of the application // list. You can retrieve the update token by getting the list. ListUpdateToken *string // A map of previous version numbers to their corresponding App object arrays. 
PreviousAppsList map[string][]App } ``` ### HydrateConfig Use `HydrateConfig` in a table definition to provide granular control over the behavior of a hydrate function. Things you can control with a `HydrateConfig`: - Errors to ignore. - Errors to retry. - Max concurrent calls to allow. - Hydrate dependencies - Rate-limiter tags For a `Get` or `List`, you can specify errors to ignore and/or retry using `DefaultIgnoreConfig` and `DefaultRetryConfig` as seen here in [the Fastly plugin](https://github.com/turbot/steampipe-plugin-fastly/blob/550922bae7bc066e12ddd7634d96c9dd33374eed/fastly/plugin.go#L20-L22). ```go func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-fastly", ConnectionConfigSchema: &plugin.ConnectionConfigSchema{ NewInstance: ConfigInstance, }, DefaultTransform: transform.FromGo().NullIfZero(), DefaultIgnoreConfig: &plugin.IgnoreConfig{ ShouldIgnoreErrorFunc: shouldIgnoreErrors([]string{"404"}), }, DefaultRetryConfig: &plugin.RetryConfig{ ShouldRetryErrorFunc: shouldRetryError([]string{"429"}), }, TableMap: map[string]*plugin.Table{ "fastly_acl": tableFastlyACL(ctx), ... "fastly_token": tableFastlyToken(ctx), }, } return p } ``` For other hydrate functions, you do this with `HydrateConfig`. Here's how the `oci_identity_tenancy` table [configures error handling](https://github.com/turbot/steampipe-plugin-oci/blob/4403adee869853b3d205e8d93681af0859870701/oci/table_oci_identity_tenancy.go#L23-28) for the `getRetentionPeriod` function. ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getRetentionPeriod, ShouldIgnoreError: isNotFoundError([]string{"404"}), }, }, ``` You can similarly use `ShouldRetryError` along with a corresponding function that returns true if, for example, an API call hits a rate limit. ```go func shouldRetryError(err error) bool { if cloudflareErr, ok := err.(*cloudflare.APIRequestError); ok { return cloudflareErr.ClientRateLimited() } return false } ``` You can likewise use `MaxConcurrency` to limit the number of calls to a hydrate function. In practice, the granular controls afforded by `ShouldIgnoreError`, `ShouldRetryError`, and `MaxConcurrency` are not much used at the level of individual hydrate functions. Plugins are likelier to assert such control globally. But the flexibility is there if you need it. Two features of `HydrateConfig` that are used quite a bit are `Depends` and `Tags`. Use `Depends` to make a function depend on one or more others. In `aws_s3_bucket`, the function [getBucketLocation](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_s3_bucket.go#L399-L440) returns the client region that's needed by all the other functions, so they all [depend on it](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_s3_bucket.go#L27-L102). ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getBucketLocation, Tags: map[string]string{"service": "s3", "action": "GetBucketLocation"}, }, { Func: getBucketIsPublic, Depends: []plugin.HydrateFunc{getBucketLocation}, Tags: map[string]string{"service": "s3", "action": "GetBucketPolicyStatus"}, }, { Func: getBucketVersioning, Depends: []plugin.HydrateFunc{getBucketLocation}, Tags: map[string]string{"service": "s3", "action": "GetBucketVersioning"}, }, ``` Use `Tags` to expose a hydrate function to control by a limiter.
In the AWS plugin's `aws_config_rule` table, the `HydrateConfig` specifies [additional hydrate functions](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_config_rule.go#L40-L49) that fetch tags and compliance details for each config rule. ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getConfigRuleTags, Tags: map[string]string{"service": "config", "action": "ListTagsForResource"}, }, { Func: getComplianceByConfigRules, Tags: map[string]string{"service": "config", "action": "DescribeComplianceByConfigRule"}, }, }, ``` In this example, the `Func` property names `getConfigRuleTags` and `getComplianceByConfigRules` as additional hydrate functions that fetch tags and compliance details for each config rule, respectively. The `Tags` property enables a rate limiter to [target these functions](https://steampipe.io/docs/guides/limiter#function-tags). (See also [function-tags](#function-tags) below.) ### Memoize: Caching hydrate results The [Memoize](https://github.com/judell/steampipe-plugin-sdk/blob/HEAD/plugin/hydrate_cache.go#L61-L139) function can be used to cache the results of a `HydrateFunc`. In the [multi_region.go](https://github.com/turbot/steampipe-plugin-aws/blob/main/aws/multi_region.go) file of the `steampipe-plugin-aws` repository, the `listRegionsForServiceCacheKey` function is used to create a custom cache key for the `listRegionsForService` function. This cache key includes the service ID, which is unique for each AWS service. Here's a simplified version of the code: ```go func listRegionsForServiceCacheKey(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) { serviceID := h.Item.(string) key := fmt.Sprintf("listRegionsForService-%s", serviceID) return key, nil } var listRegionsForService = plugin.HydrateFunc(listRegionsForServiceUncached).Memoize(memoize.WithCacheKeyFunction(listRegionsForServiceCacheKey)) ``` In this example, `Memoize` caches the results of `listRegionsForServiceUncached`. The `WithCacheKeyFunction` option specifies a custom function (`listRegionsForServiceCacheKey`) to generate the cache key. This function takes the service ID from the hydrate data and includes it in the cache key, ensuring a unique cache key for each AWS service. This is a common pattern when using `Memoize`: you define a `HydrateFunc` and then wrap it with `Memoize` to enable caching. You can also use the `WithCacheKeyFunction` option to specify a custom function that generates the cache key, which is especially useful when you need to include additional context in the cache key. ### Transform Functions Transform functions are used to extract and/or reformat data returned by a hydrate function into the desired type/format for a column. You can call your own transform function with `From`, but you probably don't need to write one -- the SDK provides many that cover the most common cases. You can chain transforms together, but the transform chain must be started with a `From` function: | Name | Description |-|- | `FromConstant` | Return a constant value (specified by 'param'). | `FromField` | Generate a value by retrieving a field from the source item. | `FromValue` | Generate a value by returning the raw hydrate item. | `FromCamel` | Generate a value by converting the given field name to camel case and retrieving from the source item. | `FromGo` | Generate a value by converting the given field name to camel case and retrieving from the source item. | `From` | Generate a value by calling a 'transformFunc'.
| `FromJSONTag` | Generate a value by finding a struct property with the json tag matching the column name. | `FromTag` | Generate a value by finding a struct property with the tag 'tagName' matching the column name. | `FromP` | Generate a value by calling 'transformFunc' passing param. Additional functions can be chained after a `From` function to transform the data: | Name | Description |-|- | `Transform` | Apply an arbitrary transform to the data (specified by 'transformFunc'). | `TransformP` | Apply an arbitrary transform to the data, passing a parameter. | `NullIfEqual` | If the input value equals the transform param, return nil. | `NullIfZero` | If the input value equals the zero value of its type, return nil. ### Translating SQL Operators to API Calls When you write SQL that resolves to API calls, you want a SQL operator like `>` to influence an API call in the expected way. Consider this query: ``` SELECT * FROM github_issue WHERE updated_at > '2022-01-01' ``` You would like the underlying API call to filter accordingly. In order to intercept the SQL operator, and implement it in your table code, you [declare it](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_issue.go#L75-L93) in the `KeyColumns` property of the table. ``` KeyColumns: []*plugin.KeyColumn{ { Name: "repository_full_name", Require: plugin.Required, }, { Name: "author_login", Require: plugin.Optional, }, { Name: "state", Require: plugin.Optional, }, { Name: "updated_at", Require: plugin.Optional, Operators: []string{">", ">="}, // declare operators your get/list/hydrate function handles }, ``` Then, in your table code, you write a handler for the column. The handler configures the API to [filter on one or more operators](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_issue.go#L135-L147). 
``` if d.Quals["updated_at"] != nil { for _, q := range d.Quals["updated_at"].Quals { givenTime := q.Value.GetTimestampValue().AsTime() // timestamp from the SQL query afterTime := givenTime.Add(time.Second * 1) // one second after the given time switch q.Operator { case ">": filters.Since = githubv4.NewDateTime(githubv4.DateTime{Time: afterTime}) // handle WHERE updated_at > '2022-01-01' case ">=": filters.Since = githubv4.NewDateTime(githubv4.DateTime{Time: givenTime}) // handle WHERE updated_at >= '2022-01-01' } } } ``` ### Example: Table Definition File ```go package zendesk import ( "context" "github.com/nukosuke/go-zendesk/zendesk" "github.com/turbot/steampipe-plugin-sdk/v5/grpc/proto" "github.com/turbot/steampipe-plugin-sdk/v5/plugin" ) func tableZendeskUser() *plugin.Table { return &plugin.Table{ Name: "zendesk_user", Description: "Zendesk Support has three types of users: end users (your customers), agents, and administrators.", List: &plugin.ListConfig{ Hydrate: listUser, }, Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("id"), Hydrate: getUser, }, Columns: []*plugin.Column{ {Name: "active", Type: proto.ColumnType_BOOL, Description: "False if the user has been deleted"}, {Name: "alias", Type: proto.ColumnType_STRING, Description: "An alias displayed to end users"}, {Name: "chat_only", Type: proto.ColumnType_BOOL, Description: "Whether or not the user is a chat-only agent"}, {Name: "created_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time the user was created"}, {Name: "custom_role_id", Type: proto.ColumnType_INT, Description: "A custom role if the user is an agent on the Enterprise plan"}, {Name: "default_group_id", Type: proto.ColumnType_INT, Description: "The id of the user's default group"}, {Name: "details", Type: proto.ColumnType_STRING, Description: "Any details you want to store about the user, such as an address"}, {Name: "email", Type: proto.ColumnType_STRING, Description: "The user's primary email address. *Writeable on create only. On update, a secondary email is added."}, {Name: "external_id", Type: proto.ColumnType_STRING, Description: "A unique identifier from another system. The API treats the id as case insensitive. Example: \"ian1\" and \"Ian1\" are the same user"}, {Name: "id", Type: proto.ColumnType_INT, Description: "Automatically assigned when the user is created"}, {Name: "last_login_at", Type: proto.ColumnType_TIMESTAMP, Description: "The last time the user signed in to Zendesk Support"}, {Name: "locale", Type: proto.ColumnType_STRING, Description: "The user's locale. A BCP-47 compliant tag for the locale. If both \"locale\" and \"locale_id\" are present on create or update, \"locale_id\" is ignored and only \"locale\" is used."}, {Name: "locale_id", Type: proto.ColumnType_INT, Description: "The user's language identifier"}, {Name: "moderator", Type: proto.ColumnType_BOOL, Description: "Designates whether the user has forum moderation capabilities"}, {Name: "name", Type: proto.ColumnType_STRING, Description: "The user's name"}, {Name: "notes", Type: proto.ColumnType_STRING, Description: "Any notes you want to store about the user"}, {Name: "only_private_comments", Type: proto.ColumnType_BOOL, Description: "true if the user can only create private comments"}, {Name: "organization_id", Type: proto.ColumnType_INT, Description: "The id of the user's organization. 
If the user has more than one organization memberships, the id of the user's default organization"}, {Name: "phone", Type: proto.ColumnType_STRING, Description: "The user's primary phone number."}, {Name: "photo_content_type", Type: proto.ColumnType_STRING, Description: "The content type of the image. Example value: \"image/png\""}, {Name: "photo_content_url", Type: proto.ColumnType_STRING, Description: "A full URL where the attachment image file can be downloaded"}, {Name: "photo_deleted", Type: proto.ColumnType_STRING, Description: "If true, the attachment has been deleted"}, {Name: "photo_file_name", Type: proto.ColumnType_STRING, Description: "The name of the image file"}, {Name: "photo_id", Type: proto.ColumnType_INT, Description: "Automatically assigned when created"}, {Name: "photo_inline", Type: proto.ColumnType_BOOL, Description: "If true, the attachment is excluded from the attachment list and the attachment's URL can be referenced within the comment of a ticket. Default is false"}, {Name: "photo_size", Type: proto.ColumnType_INT, Description: "The size of the image file in bytes"}, {Name: "photo_thumbnails", Type: proto.ColumnType_JSON, Description: "An array of attachment objects. Note that photo thumbnails do not have thumbnails"}, {Name: "report_csv", Type: proto.ColumnType_BOOL, Description: "Whether or not the user can access the CSV report on the Search tab of the Reporting page in the Support admin interface."}, {Name: "restricted_agent", Type: proto.ColumnType_BOOL, Description: "If the agent has any restrictions; false for admins and unrestricted agents, true for other agents"}, {Name: "role", Type: proto.ColumnType_STRING, Description: "The user's role. Possible values are \"end-user\", \"agent\", or \"admin\""}, {Name: "role_type", Type: proto.ColumnType_INT, Description: "The user's role id. 0 for custom agents, 1 for light agent, 2 for chat agent, and 3 for chat agent added to the Support account as a contributor (Chat Phase 4)"}, {Name: "shared", Type: proto.ColumnType_BOOL, Description: "If the user is shared from a different Zendesk Support instance. Ticket sharing accounts only"}, {Name: "shared_agent", Type: proto.ColumnType_BOOL, Description: "If the user is a shared agent from a different Zendesk Support instance. Ticket sharing accounts only"}, {Name: "shared_phone_number", Type: proto.ColumnType_BOOL, Description: "Whether the phone number is shared or not."}, {Name: "signature", Type: proto.ColumnType_STRING, Description: "The user's signature. Only agents and admins can have signatures"}, {Name: "suspended", Type: proto.ColumnType_BOOL, Description: "If the agent is suspended. Tickets from suspended users are also suspended, and these users cannot sign in to the end user portal"}, {Name: "tags", Type: proto.ColumnType_JSON, Description: "The user's tags. Only present if your account has user tagging enabled"}, {Name: "ticket_restriction", Type: proto.ColumnType_STRING, Description: "Specifies which tickets the user has access to. 
Possible values are: \"organization\", \"groups\", \"assigned\", \"requested\", null"}, {Name: "timezone", Type: proto.ColumnType_STRING, Description: "The user's time zone."}, {Name: "two_factor_auth_enabled", Type: proto.ColumnType_BOOL, Description: "If two factor authentication is enabled"}, {Name: "updated_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time the user was last updated"}, {Name: "url", Type: proto.ColumnType_STRING, Description: "The user's API url"}, {Name: "user_fields", Type: proto.ColumnType_JSON, Description: "Values of custom fields in the user's profile."}, {Name: "verified", Type: proto.ColumnType_BOOL, Description: "Any of the user's identities is verified."}, }, } } func listUser(ctx context.Context, d *plugin.QueryData, _ *plugin.HydrateData) (interface{}, error) { conn, err := connect(ctx) if err != nil { return nil, err } opts := &zendesk.UserListOptions{ PageOptions: zendesk.PageOptions{ Page: 1, PerPage: 100, }, } for true { users, page, err := conn.GetUsers(ctx, opts) if err != nil { return nil, err } for _, t := range users { d.StreamListItem(ctx, t) } if !page.HasNext() { break } opts.Page++ } return nil, nil } func getUser(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) { conn, err := connect(ctx) if err != nil { return nil, err } quals := d.KeyColumnQuals plugin.Logger(ctx).Warn("getUser", "quals", quals) id := quals["id"].GetInt64Value() plugin.Logger(ctx).Warn("getUser", "id", id) result, err := conn.GetUser(ctx, id) if err != nil { return nil, err } return result, nil } ``` ### Dynamic Tables In the plugin definition, if `SchemaMode` is set to `dynamic`, every time Steampipe starts, the plugin's schema will be checked for any changes since the last time it loaded, and re-import the schema if it detects any. Dynamic tables are useful when you are building a plugin whose schema is not known at compile time; instead, its schema will be generated at runtime. For instance, a plugin with dynamic tables is useful if you want to load CSV files as tables from one or more directories. Each of these CSV files may have different column structures, resulting in a different structure for each table. In order to create a dynamic table, in the plugin definition, `TableMapFunc` should call a function that returns `map[string]*plugin.Table`. 
For instance, in the [CSV plugin](https://hub.steampipe.io/plugins/turbot/csv): ```go func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-csv", ConnectionConfigSchema: &plugin.ConnectionConfigSchema{ NewInstance: ConfigInstance, Schema: ConfigSchema, }, DefaultTransform: transform.FromGo().NullIfZero(), SchemaMode: plugin.SchemaModeDynamic, TableMapFunc: PluginTables, } return p } func PluginTables(ctx context.Context, p *plugin.Plugin) (map[string]*plugin.Table, error) { // Initialize tables tables := map[string]*plugin.Table{} // Search for CSV files to create as tables paths, err := csvList(ctx, p) if err != nil { return nil, err } for _, i := range paths { tableCtx := context.WithValue(ctx, "path", i) base := filepath.Base(i) // tableCSV returns a *plugin.Table type tables[base[0:len(base)-len(filepath.Ext(base))]] = tableCSV(tableCtx, p) } return tables, nil } ``` The `tableCSV` function mentioned in the example above looks for all CSV files in the configured paths, and for each one, builds a `*plugin.Table` type: ```go func tableCSV(ctx context.Context, p *plugin.Plugin) *plugin.Table { path := ctx.Value("path").(string) csvFile, err := os.Open(path) if err != nil { plugin.Logger(ctx).Error("Could not open CSV file", "path", path) panic(err) } r := csv.NewReader(csvFile) csvConfig := GetConfig(p.Connection) if csvConfig.Separator != nil && *csvConfig.Separator != "" { r.Comma = rune((*csvConfig.Separator)[0]) } if csvConfig.Comment != nil { if *csvConfig.Comment == "" { // Disable comments r.Comment = 0 } else { // Set the comment character r.Comment = rune((*csvConfig.Comment)[0]) } } // Read the header to peak at the column names header, err := r.Read() if err != nil { plugin.Logger(ctx).Error("Error parsing CSV header:", "path", path, "header", header, "err", err) panic(err) } cols := []*plugin.Column{} for idx, i := range header { cols = append(cols, &plugin.Column{Name: i, Type: proto.ColumnType_STRING, Transform: transform.FromField(i), Description: fmt.Sprintf("Field %d.", idx)}) } return &plugin.Table{ Name: path, Description: fmt.Sprintf("CSV file at %s", path), List: &plugin.ListConfig{ Hydrate: listCSVWithPath(path), }, Columns: cols, } } ``` The end result is when using the CSV plugin, whenever Steampipe starts, it will check for any new, deleted, and modified CSV files in the configured `paths` and create any discovered CSVs as tables. The CSV filenames are turned directly into table names. For more information on how the CSV plugin can be queried as a result of being a dynamic table, please see the [{csv_filename}](https://hub.steampipe.io/plugins/turbot/csv/tables/%7Bcsv_filename%7D) table document. ## Client-Side Rate Limiting The Steampipe Plugin SDK supports a [client-side rate limiting implementation](/docs/guides/limiter) to allow users to define [plugin `limiter` blocks](/docs/reference/config-files/plugin#limiter) to control concurrency and rate limiting. Support for limiters is built in to the SDK and basic functionality requires no changes to the plugin code; Just including the SDK will enable users to create limiters for your plugin using the built in `connection`, `table`, and `function_name` scopes. You can add additional flexibility by adding [function tags](#function-tags) and by [accounting for paging in List calls](#accounting-for-paged-list-calls). ## Function Tags Hydrate function tags provide useful diagnostic metadata, and they can also be used as scopes in rate limiters. 
Rate limiting requirements vary by plugin because the underlying APIs that they access implement rate limiting differently. Tags provide a way for a plugin author to scope rate limiters in a way that aligns with the API implementation. Tags can be added to a ListConfig, GetConfig, or HydrateConfig. ```go //// TABLE DEFINITION func tableAwsSnsTopic(_ context.Context) *plugin.Table { return &plugin.Table{ Name: "aws_sns_topic", Description: "AWS SNS Topic", Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("topic_arn"), IgnoreConfig: &plugin.IgnoreConfig{ ShouldIgnoreErrorFunc: shouldIgnoreErrors([]string{"NotFound", "InvalidParameter"}), }, Hydrate: getTopicAttributes, Tags: map[string]string{"service": "sns", "action": "GetTopicAttributes"}, }, List: &plugin.ListConfig{ Hydrate: listAwsSnsTopics, Tags: map[string]string{"service": "sns", "action": "ListTopics"}, }, HydrateConfig: []plugin.HydrateConfig{ { Func: listTagsForSnsTopic, Tags: map[string]string{"service": "sns", "action": "ListTagsForResource"}, }, { Func: getTopicAttributes, Tags: map[string]string{"service": "sns", "action": "GetTopicAttributes"}, }, }, ... ``` Once the tags are added to the plugin, you can use them in the `scope` and `where` arguments for your rate limiter. ```hcl plugin "aws" { limiter "sns_get_topic_attributes_us_east_1" { bucket_size = 3000 fill_rate = 3000 scope = ["connection", "region", "service", "action"] where = "action = 'GetTopicAttributes' and service = 'sns' and region = 'us-east-1' " } } ``` ## Accounting for Paged List Calls The Steampipe plugin SDK transparently handles most of the details around waiting for limiters. List calls, however, usually iterate through pages of results, and each call to fetch a page must wait for any limiters that are defined. The SDK provides a hook, `WaitForListRateLimit`, which should be called before paging to apply rate limiting to the list call: ```go // List call for paginator.HasMorePages() { // apply rate limiting d.WaitForListRateLimit(ctx) output, err := paginator.NextPage(ctx) if err != nil { plugin.Logger(ctx).Error("List error", "api_error", err) return nil, err } for _, items := range output.Items { d.StreamListItem(ctx, items) // Context can be cancelled due to manual cancellation or the limit has been hit if d.RowsRemaining(ctx) == 0 { return nil, nil } } } ``` --- ## Logging A logger is passed to the plugin via the context. You can use the logger to write messages to the log at standard log levels: ```go logger := plugin.Logger(ctx) logger.Info("Log message and a variable", myVariable) ``` The plugin logs do not currently get written to the console, but are written to the plugin logs at `~/.steampipe/logs/plugin-YYYY-MM-DD.log`, e.g., `~/.steampipe/logs/plugin-2022-01-01.log`. Steampipe uses the hclog package, which uses standard log levels (`TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`). By default, the log level is `WARN`. You set it using the `STEAMPIPE_LOG_LEVEL` environment variable: ```bash export STEAMPIPE_LOG_LEVEL=TRACE ``` --- ## Installing and Testing Your Plugin A plugin binary can be installed manually, and this is often convenient when developing the plugin. 
Steampipe will attempt to load any plugin that is referred to in a `connection` configuration: - The plugin binary file must have a `.plugin` extension - The plugin binary must reside in a subdirectory of the `~/.steampipe/plugins/local/` directory and must be the ONLY `.plugin` file in that subdirectory - The `connection` must specify the path (relative to `~/.steampipe/plugins/`) to the plugin in the `plugin` argument For example, consider a `myplugin` plugin that you have developed. To install it: - Create a subdirectory `.steampipe/plugins/local/myplugin` - Name your plugin binary `myplugin.plugin`, and copy it to `.steampipe/plugins/local/myplugin/myplugin.plugin` - Create a `~/.steampipe/config/myplugin.spc` config file containing a connection definition that points to your plugin: ```hcl connection "myplugin" { plugin = "local/myplugin" } ``` - Your connection will be loaded the next time Steampipe runs. If Steampipe is running service mode, you must restart it to load the connection. --- --- --- title: Writing Your First Table sidebar_label: Writing Your First Table --- # Writing Your First Table The Steampipe Plugin SDK makes writing tables fast, easy, and fun! This guide will walk you through building the AWS plugin locally, testing a minor change, and then how to start creating a new table. ## Prerequisites - Install Golang - Install Steampipe ## Clone the Repository 1. Clone the Steampipe Plugin AWS repository: ```bash git clone https://github.com/turbot/steampipe-plugin-aws.git cd steampipe-plugin-aws ``` ## Build and Run Locally 1. Copy the default `config/aws.spc` into `~/.steampipe/config`. If not using the default AWS profile, please see AWS plugin for more information on connection configuration. ```bash cp config/aws.spc ~/.steampipe/config/aws.spc ``` 2. Run `make` to build the plugin locally and install the new version to your `~/.steampipe/plugins` directory: ```bash make ``` 3. Launch the Steampipe query shell: ```bash steampipe query ``` 4. Test basic functionality: ```sql .inspect aws select name, region from aws_s3_bucket; ``` ## Make Your First Change 1. Edit the `aws/table_aws_s3_bucket.go` table file. 2. Locate the definition for the `name` column: ```go { Name: "name", Description: "The user friendly name of the bucket.", Type: proto.ColumnType_STRING, }, ``` 3. Copy the code above and create a duplicate column `name_test`: ```go { Name: "name_test", Description: "Testing new column.", Type: proto.ColumnType_STRING, Transform: transform.FromField("Name"), }, ``` 4. Save your changes in `aws/table_aws_s3_bucket.go`. 5. Run `make` to re-build the plugin: ```bash make ``` 6. Launch the Steampipe query shell: ```bash steampipe query ``` 7. Test your changes by inspecting and querying the new column: ```sql .inspect aws_s3_bucket select name, name_test, region from aws_s3_bucket; ``` 8. The `name` and `name_test` columns should have the same data in them for each bucket. 9. Undo your changes in `aws/table_aws_s3_bucket.go` once done testing: ```bash git restore aws/table_aws_s3_bucket.go make ``` ## Create a New Table 1. Create a new file in `aws/`, copying an existing table and following the table naming standards in Steampipe Table & Column Standards: ```bash cp aws/table_aws_s3_bucket.go aws/table_aws_new_table.go ``` 2. Check if the AWS service has a service connection function in `aws/service.go` already; if not, add a new function in `aws/service.go`. 3. Add an entry for the new table into the `TableMap` in `aws/plugin.go`. 
For more information on this file, please see Writing Plugins - plugin.go. 5. Update the code in your new table so the table returns the correct information for its AWS resource. 4. Add a document for the table in `docs/tables/` following the Table Documentation Standards. ## References - Steampipe Table & Column Standards - Table Documentation Standards - Writing Example Queries - Coding Standards --- --- title: Dynamic Tables sidebar_label: Dynamic Tables --- ## Dynamic Tables In the plugin definition, if `SchemaMode` is set to `dynamic`, every time Steampipe starts, the plugin's schema will be checked for any changes since the last time it loaded, and re-import the schema if it detects any. Dynamic tables are useful when you are building a plugin whose schema is not known at compile time; instead, its schema will be generated at runtime. For instance, a plugin with dynamic tables is useful if you want to load CSV files as tables from one or more directories. Each of these CSV files may have different column structures, resulting in a different structure for each table. In order to create a dynamic table, in the plugin definition, `TableMapFunc` should call a function that returns `map[string]*plugin.Table`. For instance, in the [CSV plugin](https://hub.steampipe.io/plugins/turbot/csv): ```go func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-csv", ConnectionConfigSchema: &plugin.ConnectionConfigSchema{ NewInstance: ConfigInstance, Schema: ConfigSchema, }, DefaultTransform: transform.FromGo().NullIfZero(), SchemaMode: plugin.SchemaModeDynamic, TableMapFunc: PluginTables, } return p } func PluginTables(ctx context.Context, p *plugin.Plugin) (map[string]*plugin.Table, error) { // Initialize tables tables := map[string]*plugin.Table{} // Search for CSV files to create as tables paths, err := csvList(ctx, p) if err != nil { return nil, err } for _, i := range paths { tableCtx := context.WithValue(ctx, "path", i) base := filepath.Base(i) // tableCSV returns a *plugin.Table type tables[base[0:len(base)-len(filepath.Ext(base))]] = tableCSV(tableCtx, p) } return tables, nil } ``` The `tableCSV` function mentioned in the example above looks for all CSV files in the configured paths, and for each one, builds a `*plugin.Table` type: ```go func tableCSV(ctx context.Context, p *plugin.Plugin) *plugin.Table { path := ctx.Value("path").(string) csvFile, err := os.Open(path) if err != nil { plugin.Logger(ctx).Error("Could not open CSV file", "path", path) panic(err) } r := csv.NewReader(csvFile) csvConfig := GetConfig(p.Connection) if csvConfig.Separator != nil && *csvConfig.Separator != "" { r.Comma = rune((*csvConfig.Separator)[0]) } if csvConfig.Comment != nil { if *csvConfig.Comment == "" { // Disable comments r.Comment = 0 } else { // Set the comment character r.Comment = rune((*csvConfig.Comment)[0]) } } // Read the header to peak at the column names header, err := r.Read() if err != nil { plugin.Logger(ctx).Error("Error parsing CSV header:", "path", path, "header", header, "err", err) panic(err) } cols := []*plugin.Column{} for idx, i := range header { cols = append(cols, &plugin.Column{Name: i, Type: proto.ColumnType_STRING, Transform: transform.FromField(i), Description: fmt.Sprintf("Field %d.", idx)}) } return &plugin.Table{ Name: path, Description: fmt.Sprintf("CSV file at %s", path), List: &plugin.ListConfig{ Hydrate: listCSVWithPath(path), }, Columns: cols, } } ``` The end result is when using the CSV plugin, whenever Steampipe starts, it will 
check for any new, deleted, and modified CSV files in the configured `paths` and create any discovered CSVs as tables. The CSV filenames are turned directly into table names. For more information on how the CSV plugin can be queried as a result of being a dynamic table, please see the [{csv_filename}](https://hub.steampipe.io/plugins/turbot/csv/tables/%7Bcsv_filename%7D) table document. --- --- title: Hydrate Functions sidebar_label: Hydrate Functions --- # Hydrate Functions A hydrate function connects to an external system or service and gathers data to fill a database table. `List` and `Get` are hydrate functions, defined in the [Table Definition](/docs/develop/writing_plugins/the-basics#table-definition), that have these special characteristics: - Every table ***must*** define a `List` and/or `Get` function. - The `List` or `Get` will always be called before any other hydrate function in the table, as the other functions typically depend on the result of the `Get` or `List` call. - Whether `List` or `Get` is called depends upon whether the qualifiers (in `where` clauses and `join...on`) match the `KeyColumns` defined in the [Get Config](/docs/develop/writing_plugins/the-basics#get-config). This enables Steampipe to fetch only the "row" data that it needs. - Typically, hydrate functions return a single data item (data for a single row). *List functions are an exception* — they stream data for multiple rows using the [QueryData](https://github.com/turbot/steampipe-plugin-sdk/blob/HEAD/plugin/query_data.go) object, and return `nil`. - The `Get` function will usually get the key column data from the `QueryData.KeyColumnQuals` so that it can get the appropriate item based on the qualifiers (`where` clause, `join...on`). If the `Get` hydrate function is used as both a `Get` function AND a normal hydrate function, you should get the key column data from the `HydrateData.Item` if it is not nil, and use the `QueryData.KeyColumnQuals` otherwise. ## About List Functions A `List` function retrieves all the items of a particular resource type from an API. For example, the [zendesk_group](https://hub.steampipe.io/plugins/turbot/zendesk/tables/zendesk_group) table supports the query: ```sql select * from zendesk_group ``` The function `tableZendeskGroup` [defines the table](https://github.com/turbot/steampipe-plugin-zendesk/blob/33c9cb30826c41d75c7d07d1947e2fd9fd5735d1/zendesk/table_zendesk_group.go#L10-L30). ```go package zendesk import ( "context" "github.com/turbot/steampipe-plugin-sdk/v5/grpc/proto" "github.com/turbot/steampipe-plugin-sdk/v5/plugin" ) func tableZendeskGroup() *plugin.Table { return &plugin.Table{ Name: "zendesk_group", Description: "When support requests arrive in Zendesk Support, they can be assigned to a Group. Groups serve as the core element of ticket workflow; support agents are organized into Groups and tickets can be assigned to a Group only, or to an assigned agent within a Group.
A ticket can never be assigned to an agent without also being assigned to a Group.", List: &plugin.ListConfig{ Hydrate: listGroup, }, Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("id"), Hydrate: getGroup, }, Columns: []*plugin.Column{ {Name: "id", Type: proto.ColumnType_INT, Description: "Unique identifier for the group"}, {Name: "url", Type: proto.ColumnType_STRING, Description: "API url of the group"}, {Name: "name", Type: proto.ColumnType_STRING, Description: "Name of the group"}, {Name: "deleted", Type: proto.ColumnType_BOOL, Description: "True if the group has been deleted"}, {Name: "created_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time the group was created"}, {Name: "updated_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time of the last update of the group"}, }, } } ``` The table's `List` property refers, by way of the `Hydrate` property, to a Steampipe function that lists Zendesk groups, [listGroup](https://github.com/turbot/steampipe-plugin-zendesk/blob/33c9cb30826c41d75c7d07d1947e2fd9fd5735d1/zendesk/table_zendesk_group.go#L32-L46). That function calls the Zendesk Go SDK's [GetGroups](https://github.com/nukosuke/go-zendesk/blob/cfe7c2f3969555054ea51b90b2a60a219e309a43/zendesk/group.go#L44-L70) and returns an array of [Group](https://github.com/nukosuke/go-zendesk/blob/cfe7c2f3969555054ea51b90b2a60a219e309a43/zendesk/group.go#L12-L21). ```go type Group struct { ID int64 `json:"id,omitempty"` URL string `json:"url,omitempty"` Name string `json:"name"` Default bool `json:"default,omitempty"` Deleted bool `json:"deleted,omitempty"` Description string `json:"description,omitempty"` CreatedAt time.Time `json:"created_at,omitempty"` UpdatedAt time.Time `json:"updated_at,omitempty"` } ``` A Steampipe `List` function is one of two special forms of [hydrate function](/docs/develop/writing-plugins#hydrate-functions) — `Get` is the other — that take precedence over other [hydrate functions](/docs/develop/writing_plugins/hydrate-functions). ## About Get Functions A `Get` function fetches a single item by its key. While it's possible to define a table that only uses `Get`, the common pattern combines `List` to retrieve basic data and `Get` to enrich it. Here's the [Get function](https://github.com/turbot/steampipe-plugin-zendesk/blob/33c9cb30826c41d75c7d07d1947e2fd9fd5735d1/zendesk/table_zendesk_group.go#L48-L60) for a Zendesk group. ```go func getGroup(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) { conn, err := connect(ctx, d) if err != nil { return nil, err } quals := d.EqualsQuals id := quals["id"].GetInt64Value() result, err := conn.GetGroup(ctx, id) if err != nil { return nil, err } return result, nil } ``` ### Observing List versus Get When `List` and `Get` are both defined, you can use [diagnostic mode](https://steampipe.io/docs/guides/limiter#exploring--troubleshooting-with-diagnostic-mode) to see which function Steampipe calls for a given query. ``` STEAMPIPE_DIAGNOSTIC_LEVEL=all steampipe service start ``` This query uses `List`.
``` > select jsonb_pretty(_ctx) as _ctx from zendesk_group limit 1 +--------------------------------------------------+ | _ctx | +--------------------------------------------------+ | { | | "steampipe": { | | "sdk_version": "5.8.0" | | }, | | "diagnostics": { | | "calls": [ | | { | | "type": "list", | | "scope_values": { | | "table": "zendesk_group", | | "connection": "zendesk", | | "function_name": "listGroup" | | }, | | "function_name": "listGroup", | | "rate_limiters": [ | | ], | | "rate_limiter_delay_ms": 0 | | } | | ] | | }, | | "connection_name": "zendesk" | | } | +--------------------------------------------------+ ``` This query uses `Get`. ``` > select jsonb_pretty(_ctx) as _ctx from zendesk_group where id = '24885656597005' +--------------------------------------------------+ | _ctx | +--------------------------------------------------+ | { | | "steampipe": { | | "sdk_version": "5.8.0" | | }, | | "diagnostics": { | | "calls": [ | | { | | "type": "list", | | "scope_values": { | | "table": "zendesk_group", | | "connection": "zendesk", | | "function_name": "getGroup" | | }, | | "function_name": "getGroup", | | "rate_limiters": [ | | ], | | "rate_limiter_delay_ms": 0 | | } | | ] | | }, | | "connection_name": "zendesk" | | } | +--------------------------------------------------+ ``` This works because `id` is one of the `KeyColumns` in the `Get` property of the table definition. ```go Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("id"), Hydrate: getGroup, }, ``` That enables the [Steampipe plugin SDK](https://github.com/turbot/steampipe-plugin-sdk) to choose the more optimal `getGroup` function when the `id` is known. ### List or Get in Combination with Hydrate In addition to the special `List` and `Get` hydrate functions, there's a class of general hydrate functions that enrich what's returned by `List` or `Get`. The Zendesk plugin doesn't use any of these, but in `table_aws_cloudtrail_trail.go`, [getCloudTrailStatus](https://github.com/turbot/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_cloudtrail_trail.go#L329-L369) is an example of this kind of function. Steampipe knows it's a `HydrateFunc` because the column definition [declares](https://github.com/turbot/steampipe-plugin-aws/blob/40058d8fd15a677214cfa3e22de35cde707775e7/aws/table_aws_cloudtrail_trail.go#L149-L155) it using the `Hydrate` property. ```go { Name: "latest_cloudwatch_logs_delivery_error", Description: "Displays any CloudWatch Logs error that CloudTrail encountered when attempting to deliver logs to CloudWatch Logs.", Type: proto.ColumnType_STRING, Hydrate: getCloudtrailTrailStatus, Transform: transform.FromField("LatestCloudWatchLogsDeliveryError"), }, ``` ## HydrateConfig Use `HydrateConfig` in a table definition to provide granular control over the behavior of a hydrate function. Things you can control with a `HydrateConfig`: - Errors to ignore. - Errors to retry. - Max concurrent calls to allow. - Hydrate dependencies - Rate-limiter tags For a `Get` or `List`, you can specify errors to ignore and/or retry using `DefaultIgnoreConfig` and `DefaultRetryConfig` as seen here in [the Fastly plugin](https://github.com/turbot/steampipe-plugin-fastly/blob/550922bae7bc066e12ddd7634d96c9dd33374eed/fastly/plugin.go#L20-L22). 
```go func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-fastly", ConnectionConfigSchema: &plugin.ConnectionConfigSchema{ NewInstance: ConfigInstance, }, DefaultTransform: transform.FromGo().NullIfZero(), DefaultIgnoreConfig: &plugin.IgnoreConfig{ ShouldIgnoreErrorFunc: shouldIgnoreErrors([]string{"404"}), }, DefaultRetryConfig: &plugin.RetryConfig{ ShouldRetryErrorFunc: shouldRetryError([]string{"429"}), }, TableMap: map[string]*plugin.Table{ "fastly_acl": tableFastlyACL(ctx), ... "fastly_token": tableFastlyToken(ctx), }, } return p } ``` For other hydrate functions, you do this with `HydrateConfig`. Here's how the `oci_identity_tenancy` table [configures error handling](https://github.com/turbot/steampipe-plugin-oci/blob/4403adee869853b3d205e8d93681af0859870701/oci/table_oci_identity_tenancy.go#L23-28) for the `getRetentionPeriod` function. ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getRetentionPeriod, ShouldIgnoreError: isNotFoundError([]string{"404"}), }, }, ``` You can similarly use `ShouldRetryError` along with a corresponding function that returns true if, for example, an API call hits a rate limit. ```go func shouldRetryError(err error) bool { if cloudflareErr, ok := err.(*cloudflare.APIRequestError); ok { return cloudflareErr.ClientRateLimited() } return false } ``` You can likewise use `MaxConcurrency` to limit the number of calls to a hydrate function. In practice, the granular controls afforded by `ShouldIgnoreError`, `ShouldRetryError`, and `MaxConcurrency` are not much used at the level of individual hydrate functions. Plugins are likelier to assert such control globally. But the flexibility is there if you need it. Two features of `HydrateConfig` that are used quite a bit are `Depends` and `Tags`. Use `Depends` to make a function depend on one or more others. In `aws_s3_bucket`, the function [getBucketLocation](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_s3_bucket.go#L399-L440) returns the client region that's needed by all the other functions, so they all [depend on it](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_s3_bucket.go#L27-L102). ```go HydrateConfig: []plugin.HydrateConfig{ { Func: getBucketLocation, Tags: map[string]string{"service": "s3", "action": "GetBucketLocation"}, }, { Func: getBucketIsPublic, Depends: []plugin.HydrateFunc{getBucketLocation}, Tags: map[string]string{"service": "s3", "action": "GetBucketPolicyStatus"}, }, { Func: getBucketVersioning, Depends: []plugin.HydrateFunc{getBucketLocation}, Tags: map[string]string{"service": "s3", "action": "GetBucketVersioning"}, }, ``` Use `Tags` to expose a hydrate function to control by a limiter. In the AWS plugin's `aws_config_rule` table, the `HydrateConfig` specifies [additional hydrate functions](https://github.com/turbot/steampipe-plugin-aws/blob/66bd381dfaccd3d16ccedba660cd05adaa17c7d7/aws/table_aws_config_rule.go#L40-L49) that fetch tags and compliance details for each config rule.
```go HydrateConfig:plugin.HydrateConfig{ { Func: getConfigRuleTags, Tags: map[string]string{"service": "config", "action": "ListTagsForResource"}, }, { Func: getComplianceByConfigRules, Tags: map[string]string{"service": "config", "action": "DescribeComplianceByConfigRule"}, }, }, ``` In this example the `Func` property names `getConfigRuleTags` and `getComplianceByConfigRules` as additional hydrate functions that fetch tags and compliance details for each config rule, respectively. The `Tags` property enables a rate limiter to [target these functions](https://steampipe.io/docs/guides/limiter#function-tags). (See also [function-tags](#function-tags) below.) ## Memoize: Caching hydrate results The [Memoize](https://github.com/judell/steampipe-plugin-sdk/blob/HEAD/plugin/hydrate_cache.go#L61-L139) function can be used to cache the results of a `HydrateFunc`. In the [multi_region.go](https://github.com/turbot/steampipe-plugin-aws/blob/main/aws/multi_region.go) file of `steampipe-plugin-aws` repository, the `listRegionsForServiceCacheKey` function is used to create a custom cache key for the `listRegionsForService` function. This cache key includes the service ID, which is unique for each AWS service. Here's a simplified version of the code: ```go func listRegionsForServiceCacheKey(ctx context.Context, d *plugin.QueryData, h *plugin.HydrateData) (interface{}, error) { serviceID := h.Item.(string) key := fmt.Sprintf("listRegionsForService-%s", serviceID) return key, nil } var listRegionsForService = plugin.HydrateFunc(listRegionsForServiceUncached).Memoize(memoize.WithCacheKeyFunction(listRegionsForServiceCacheKey)) ``` In this example, `Memoize` caches the results of `listRegionsForServiceUncached`. The `WithCacheKeyFunction` option specifies a custom function (`listRegionsForServiceCacheKey`) to generate the cache key. This function takes the service ID from the hydrate data and includes it in the cache key, ensuring a unique cache key for each AWS service. This is a common pattern when using `Memoize`: you define a `HydrateFunc` and then wrap it with `Memoize` to enable caching. You can also use the `WithCacheKeyFunction` option to specify a custom function that generates the cache key, which is especially useful when you need to include additional context in the cache key. Additional functions can be chained after a `From` function to transform the data: | Name | Description |-|- | `Transform` | Apply an arbitrary transform to the data (specified by 'transformFunc'). | `TransformP` | Apply an arbitrary transform to the data, passing a parameter. | `NullIfEqual` | If the input value equals the transform param, return nil. | `NullIfZero` | If the input value equals the zero value of its type, return nil. --- --- title: Implementing Tables sidebar_label: Implementing Tables --- # Implementing Tables By convention, each table should be implemented in a separate file named `table_{table name}.go`. Each table will have a single table definition function that returns a pointer to a `plugin.Table` (this is the function specified in the `TableMap` of the [plugin definition](/docs/develop/writing_plugins/the-basics#plugin-definition)). The function name is typically the table name in camel case (per golang standards) prefixed by `table`. The table definition specifies the name and description of the table, a list of column definitions, and the functions to call in order to list the data for all the rows, or to get data for a single row. 
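As a rough sketch of that convention, here is a hypothetical `example_widget` table (the `listExampleWidgets` and `getExampleWidget` hydrate functions are assumed to be defined elsewhere in the package):

```go
package example

import (
	"github.com/turbot/steampipe-plugin-sdk/v5/grpc/proto"
	"github.com/turbot/steampipe-plugin-sdk/v5/plugin"
)

// tableExampleWidget would live in table_example_widget.go and be referenced
// from the plugin's TableMap as "example_widget": tableExampleWidget().
func tableExampleWidget() *plugin.Table {
	return &plugin.Table{
		Name:        "example_widget",
		Description: "Widgets from a hypothetical Example API.",
		List: &plugin.ListConfig{
			Hydrate: listExampleWidgets, // streams one item per row
		},
		Get: &plugin.GetConfig{
			KeyColumns: plugin.SingleColumn("id"),
			Hydrate:    getExampleWidget, // fetches a single row by its key
		},
		Columns: []*plugin.Column{
			{Name: "id", Type: proto.ColumnType_STRING, Description: "Unique identifier of the widget."},
			{Name: "name", Type: proto.ColumnType_STRING, Description: "Name of the widget."},
		},
	}
}
```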
When a connection is created, Steampipe uses the table and column definitions to create the Postgres foreign tables; however, the tables don't store the data — the data is populated (hydrated) when a query is run.

The basic flow is:
1. A user runs a Steampipe query against the database.
1. Postgres parses the query and sends the parsed request to the Steampipe FDW.
1. The Steampipe Foreign Data Wrapper ([Steampipe FDW](https://github.com/turbot/steampipe-postgres-fdw)) determines what tables and columns are required.
1. The FDW calls the appropriate [Hydrate Functions](/docs/develop/writing_plugins/hydrate-functions) in the plugin, which fetch the appropriate data from the API, cloud provider, etc.
    - Each table defines two special hydrate functions, `List` and `Get`. The `List` or `Get` will always be called before any other hydrate function in the table, as the other functions typically depend on the result of the Get or List call.
    - Whether `List` or `Get` is called depends upon whether the qualifiers (in `where` clauses and `join...on`) match the `KeyColumns`. This allows Steampipe to fetch only the "row" data that it needs. Qualifiers (aka quals) enable Steampipe to map a Postgres constraint (e.g. `where created_at > date('2023-01-01')`) to the API parameter (e.g. `since=1673992596000`) that the plugin's supporting SDK uses to fetch results matching the Postgres constraint. (See [Translating SQL Operators to API Calls](/docs/develop/writing_plugins/hydrate-functions#translating-sql-operators-to-api-calls).)
    - Multiple columns may (and usually do) get built from the same hydrate function, but Steampipe only calls the hydrate functions for the columns requested (specified in the `select`, `join`, or `where`). This enables Steampipe to call only those APIs for the "column" data requested in the query.
1. The [Transform Functions](/docs/develop/writing_plugins/transform-functions) are called for each column. The transform functions extract and/or reformat data returned by the hydrate functions into the format to be returned in the column.
1. The plugin returns the transformed data to the Steampipe FDW.
1. The Steampipe FDW returns the results to the database.

---

---
title: Writing Plugins
sidebar_label: Writing Plugins
---

# Writing Plugins

The Steampipe Plugin SDK makes writing tables fast, easy, and fun! Most of the heavy lifting is taken care of for you — just define your tables and columns, wire up a few API calls, and you can start to query your service with standard SQL!

While this documentation will provide an introduction and some examples, note that Steampipe is an evolving, open source project. Please refer to the code as the authoritative source, as well as for real-world examples.

Also, please try to be a good community citizen. Following the standards makes for a better, more consistent experience for end-users and developers alike.

---

---
title: SQL Operators as API Filters
sidebar_label: SQL Operators as API Filters
---

# SQL Operators as API Filters

When you write SQL that resolves to API calls, you want a SQL operator like `>` to influence an API call in the expected way. Consider this query:

```sql
SELECT * FROM github_issue WHERE updated_at > '2022-01-01'
```

You would like the underlying API call to filter accordingly.
In order to intercept the SQL operator and implement it in your table code, you [declare it](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_issue.go#L75-L93) in the `KeyColumns` property of the table.

```go
KeyColumns: []*plugin.KeyColumn{
	{
		Name:    "repository_full_name",
		Require: plugin.Required,
	},
	{
		Name:    "author_login",
		Require: plugin.Optional,
	},
	{
		Name:    "state",
		Require: plugin.Optional,
	},
	{
		Name:      "updated_at",
		Require:   plugin.Optional,
		Operators: []string{">", ">="}, // declare operators your get/list/hydrate function handles
	},
```

Then, in your table code, you write a handler for the column. The handler configures the API to [filter on one or more operators](https://github.com/turbot/steampipe-plugin-github/blob/ec932825c781a66c325fdbc5560f96cac272e64f/github/table_github_issue.go#L135-L147).

```go
if d.Quals["updated_at"] != nil {
	for _, q := range d.Quals["updated_at"].Quals {
		givenTime := q.Value.GetTimestampValue().AsTime() // timestamp from the SQL query
		afterTime := givenTime.Add(time.Second * 1)       // one second after the given time
		switch q.Operator {
		case ">":
			filters.Since = githubv4.NewDateTime(githubv4.DateTime{Time: afterTime}) // handle WHERE updated_at > '2022-01-01'
		case ">=":
			filters.Since = githubv4.NewDateTime(githubv4.DateTime{Time: givenTime}) // handle WHERE updated_at >= '2022-01-01'
		}
	}
}
```

---

---
title: The Basics
sidebar_label: The Basics
---

## main.go

The `main` function in the `main.go` file is the entry point for your plugin. This function must call `plugin.Serve` from the plugin SDK to instantiate your plugin gRPC server. You will pass the plugin function that you will create in the [plugin.go](#plugingo) file.

```go
package main

import (
	"github.com/turbot/steampipe-plugin-sdk/v5/plugin"
	"github.com/turbot/steampipe-plugin-zendesk/zendesk"
)

func main() {
	plugin.Serve(&plugin.ServeOpts{PluginFunc: zendesk.Plugin})
}
```

## plugin.go

The `plugin.go` file should implement a single [Plugin Definition](#plugin-definition) (`Plugin()` function) that returns a pointer to a `Plugin` to be loaded by the gRPC server.

By convention, the package name for your plugin should be the same name as your plugin, and go files for your plugin (except `main.go`) should reside in a folder with the same name.

## Plugin Definition

| Argument | Description
|-|-
| `Name` | The name of the plugin (`steampipe-plugin-{plugin name}`).
| `TableMap` | A map of table names to [Table definitions](#table-definition).
| `DefaultTransform` | A default [Transform Function](/docs/develop/writing_plugins/transform-functions) to be used when one is not specified. While not required, this may save quite a bit of repeated code.
| `DefaultGetConfig` | Provides an optional mechanism for providing plugin-level defaults to a get config. This is merged with the GetConfig defined in the table and/or columns. Typically, this is used to standardize error handling with `ShouldIgnoreError`.
| `SchemaMode` | Specifies if the schema should be checked and re-imported if changed every time Steampipe starts. This can be set to `dynamic` or `static`. Defaults to `static`.
| `RequiredColumns` | An optional list of columns that ALL tables in this plugin MUST implement.

### Example Plugin Definition

Here's the definition of the [Zendesk](https://github.com/turbot/steampipe-plugin-zendesk/blob/33c9cb30826c41d75c7d07d1947e2fd9fd5735d1/zendesk/plugin.go#L10-L29) plugin.
```go package zendesk import ( "context" "github.com/turbot/steampipe-plugin-sdk/v5/plugin" "github.com/turbot/steampipe-plugin-sdk/v5/plugin/transform" ) func Plugin(ctx context.Context) *plugin.Plugin { p := &plugin.Plugin{ Name: "steampipe-plugin-zendesk", DefaultTransform: transform.FromGo().NullIfZero(), TableMap: map[string]*plugin.Table{ "zendesk_brand": tableZendeskBrand(), "zendesk_group": tableZendeskGroup(), "zendesk_organization": tableZendeskOrganization(), "zendesk_search": tableZendeskSearch(), "zendesk_ticket": tableZendeskTicket(), "zendesk_ticket_audit": tableZendeskTicketAudit(), "zendesk_trigger": tableZendeskTrigger(), "zendesk_user": tableZendeskUser(), }, } return p } ``` ## Table Definition The `plugin.Table` may specify: | Argument | Description |-|- | `Name` | The name of the table. | `Description` | A short description, added as a comment on the table and used in help commands and documentation. | `Columns` | An array of [column definitions](#column-definition). | `List` | A [List Config](#list-config) definition, used to fetch the data items used to build all rows of a table. | `Get` | A [Get Config](#get-config) definition, used to fetch a single item. | `DefaultTransform` | A default [transform function](/docs/develop/writing_plugins/transform-functions) to be used when one is not specified. If set, this will override the default set in the plugin definition. ### Example Table Definition Here's how the [zendesk_user](https://github.com/turbot/steampipe-plugin-zendesk/blob/33c9cb30826c41d75c7d07d1947e2fd9fd5735d1/zendesk/table_zendesk_user.go#L12-L70) table is defined. ```go func tableZendeskUser() *plugin.Table { return &plugin.Table{ Name: "zendesk_user", Description: "Zendesk Support has three types of users: end users (your customers), agents, and administrators.", List: &plugin.ListConfig{ Hydrate: listUser, }, Get: &plugin.GetConfig{ KeyColumns: plugin.SingleColumn("id"), Hydrate: getUser, }, Columns: []*plugin.Column{ {Name: "active", Type: proto.ColumnType_BOOL, Description: "False if the user has been deleted"}, {Name: "alias", Type: proto.ColumnType_STRING, Description: "An alias displayed to end users"}, {Name: "chat_only", Type: proto.ColumnType_BOOL, Description: "Whether or not the user is a chat-only agent"}, {Name: "created_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time the user was created"}, {Name: "custom_role_id", Type: proto.ColumnType_INT, Description: "A custom role if the user is an agent on the Enterprise plan"}, {Name: "default_group_id", Type: proto.ColumnType_INT, Description: "The id of the user's default group"}, {Name: "details", Type: proto.ColumnType_STRING, Description: "Any details you want to store about the user, such as an address"}, {Name: "email", Type: proto.ColumnType_STRING, Description: "The user's primary email address. *Writeable on create only. On update, a secondary email is added."}, {Name: "external_id", Type: proto.ColumnType_STRING, Description: "A unique identifier from another system. The API treats the id as case insensitive. Example: \"ian1\" and \"Ian1\" are the same user"}, {Name: "id", Type: proto.ColumnType_INT, Description: "Automatically assigned when the user is created"}, {Name: "last_login_at", Type: proto.ColumnType_TIMESTAMP, Description: "The last time the user signed in to Zendesk Support"}, {Name: "locale", Type: proto.ColumnType_STRING, Description: "The user's locale. A BCP-47 compliant tag for the locale. 
If both \"locale\" and \"locale_id\" are present on create or update, \"locale_id\" is ignored and only \"locale\" is used."}, {Name: "locale_id", Type: proto.ColumnType_INT, Description: "The user's language identifier"}, {Name: "moderator", Type: proto.ColumnType_BOOL, Description: "Designates whether the user has forum moderation capabilities"}, {Name: "name", Type: proto.ColumnType_STRING, Description: "The user's name"}, {Name: "notes", Type: proto.ColumnType_STRING, Description: "Any notes you want to store about the user"}, {Name: "only_private_comments", Type: proto.ColumnType_BOOL, Description: "true if the user can only create private comments"}, {Name: "organization_id", Type: proto.ColumnType_INT, Description: "The id of the user's organization. If the user has more than one organization memberships, the id of the user's default organization"}, {Name: "phone", Type: proto.ColumnType_STRING, Description: "The user's primary phone number."}, {Name: "photo_content_type", Type: proto.ColumnType_STRING, Description: "The content type of the image. Example value: \"image/png\""}, {Name: "photo_content_url", Type: proto.ColumnType_STRING, Description: "A full URL where the attachment image file can be downloaded"}, {Name: "photo_deleted", Type: proto.ColumnType_STRING, Description: "If true, the attachment has been deleted"}, {Name: "photo_file_name", Type: proto.ColumnType_STRING, Description: "The name of the image file"}, {Name: "photo_id", Type: proto.ColumnType_INT, Description: "Automatically assigned when created"}, {Name: "photo_inline", Type: proto.ColumnType_BOOL, Description: "If true, the attachment is excluded from the attachment list and the attachment's URL can be referenced within the comment of a ticket. Default is false"}, {Name: "photo_size", Type: proto.ColumnType_INT, Description: "The size of the image file in bytes"}, {Name: "photo_thumbnails", Type: proto.ColumnType_JSON, Description: "An array of attachment objects. Note that photo thumbnails do not have thumbnails"}, {Name: "report_csv", Type: proto.ColumnType_BOOL, Description: "Whether or not the user can access the CSV report on the Search tab of the Reporting page in the Support admin interface."}, {Name: "restricted_agent", Type: proto.ColumnType_BOOL, Description: "If the agent has any restrictions; false for admins and unrestricted agents, true for other agents"}, {Name: "role", Type: proto.ColumnType_STRING, Description: "The user's role. Possible values are \"end-user\", \"agent\", or \"admin\""}, {Name: "role_type", Type: proto.ColumnType_INT, Description: "The user's role id. 0 for custom agents, 1 for light agent, 2 for chat agent, and 3 for chat agent added to the Support account as a contributor (Chat Phase 4)"}, {Name: "shared", Type: proto.ColumnType_BOOL, Description: "If the user is shared from a different Zendesk Support instance. Ticket sharing accounts only"}, {Name: "shared_agent", Type: proto.ColumnType_BOOL, Description: "If the user is a shared agent from a different Zendesk Support instance. Ticket sharing accounts only"}, {Name: "shared_phone_number", Type: proto.ColumnType_BOOL, Description: "Whether the phone number is shared or not."}, {Name: "signature", Type: proto.ColumnType_STRING, Description: "The user's signature. Only agents and admins can have signatures"}, {Name: "suspended", Type: proto.ColumnType_BOOL, Description: "If the agent is suspended. 
Tickets from suspended users are also suspended, and these users cannot sign in to the end user portal"},
			{Name: "tags", Type: proto.ColumnType_JSON, Description: "The user's tags. Only present if your account has user tagging enabled"},
			{Name: "ticket_restriction", Type: proto.ColumnType_STRING, Description: "Specifies which tickets the user has access to. Possible values are: \"organization\", \"groups\", \"assigned\", \"requested\", null"},
			{Name: "timezone", Type: proto.ColumnType_STRING, Description: "The user's time zone."},
			{Name: "two_factor_auth_enabled", Type: proto.ColumnType_BOOL, Description: "If two factor authentication is enabled"},
			{Name: "updated_at", Type: proto.ColumnType_TIMESTAMP, Description: "The time the user was last updated"},
			{Name: "url", Type: proto.ColumnType_STRING, Description: "The user's API url"},
			{Name: "user_fields", Type: proto.ColumnType_JSON, Description: "Values of custom fields in the user's profile."},
			{Name: "verified", Type: proto.ColumnType_BOOL, Description: "Any of the user's identities is verified."},
		},
	}
}
```

## List Config

A ListConfig definition defines how to list all rows of a table.

| Argument | Description
|-|-
| `KeyColumns` | An optional list of columns that require a qualifier in order to list data for this table.
| `Hydrate` | A [hydrate function](/docs/develop/writing_plugins/hydrate-functions) which is called first when performing a 'list' call.
| `ParentHydrate` | An optional parent list function - if you list items with a parent-child relationship, this will list the parent items.

## Get Config

A GetConfig definition defines how to get a single row of a table.

| Argument | Description
|-|-
| `KeyColumns` | A list of keys which are used to uniquely identify rows - used to determine whether a query is a 'get' call.
| `ItemFromKey` [DEPRECATED] | This property is deprecated.
| `Hydrate` | A [hydrate function](/docs/develop/writing_plugins/hydrate-functions) which is called first when performing a 'get' call. If this returns 'not found', no further hydrate functions are called.
| `ShouldIgnoreError` | A function which will return whether to ignore a given error.

## Column Definition

A column definition specifies the name and description of the column, its data type, and the functions to call to hydrate the column (if the list call does not) and transform it (if the default transformation is not sufficient).

| Argument | Description
|-|-
| `Name` | The column name.
| `Type` | The [data type](#column-data-types) for this column.
| `Description` | The column description, added as a comment and used in help commands and documentation.
| `Hydrate` | You can explicitly specify the [hydrate function](/docs/develop/writing_plugins/hydrate-functions) to populate this column. This is only needed if neither the default hydrate functions nor the `List` function return data for this column.
| `Default` | An optional default column value.
| `Transform` | An optional chain of [transform functions](/docs/develop/writing_plugins/transform-functions) to generate the column value.
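As a sketch of how these arguments combine, the hypothetical column excerpt below pairs an explicit hydrate function with a transform chain; `getItemTags` and the `Tags` field name are illustrative assumptions rather than code from a published plugin.

```go
Columns: []*plugin.Column{
	// Populated by the table's List/Get hydrate function and extracted from the
	// item's ID field by the plugin's default transform.
	{Name: "id", Type: proto.ColumnType_STRING, Description: "Unique identifier of the item."},

	// Populated by an explicit per-row hydrate function, then transformed:
	// read the Tags field of its result and return nil when it is empty.
	{
		Name:        "tags",
		Type:        proto.ColumnType_JSON,
		Description: "Tags associated with the item.",
		Hydrate:     getItemTags,
		Transform:   transform.FromField("Tags").NullIfZero(),
	},
},
```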
## Column Data Types Currently supported data types are: | Name | Type |-|- | `ColumnType_BOOL` | Boolean | `ColumnType_INT` | Integer | `ColumnType_DOUBLE` | Double precision floating point | `ColumnType_STRING` | String | `ColumnType_JSON` | JSON | `ColumnType_DATETIME` | Date/Time (Deprecated - use ColumnType_TIMESTAMP) | `ColumnType_TIMESTAMP` | Date/Time | `ColumnType_IPADDR` | IP Address | `ColumnType_CIDR` | IP network CIDR | `ColumnType_UNKNOWN` | Unknown | `ColumnType_INET` | Either an IP Address or an IP network CIDR | `ColumnType_LTREE` | [Ltree](https://www.postgresql.org/docs/current/ltree.html) --- --- title: Transform Functions sidebar_label: Transform functions --- # Transform Functions Transform functions are used to extract and/or reformat data returned by a hydrate function into the desired type/format for a column. You can call your own transform function with `From`, but you probably don't need to write one -- The SDK provides many that cover the most common cases. You can chain transforms together, but the transform chain must be started with a `From` function: | Name | Description |-|- | `FromConstant` | Return a constant value (specified by 'param'). | `FromField` | Generate a value by retrieving a field from the source item. | `FromValue` | Generate a value by returning the raw hydrate item. | `FromCamel` | Generate a value by converting the given field name to camel case and retrieving from the source item. | `FromGo` | Generate a value by converting the given field name to camel case and retrieving from the source item. | `From` | Generate a value by calling a 'transformFunc'. | `FromJSONTag` | Generate a value by finding a struct property with the json tag matching the column name. | `FromTag` | Generate a value by finding a struct property with the tag 'tagName' matching the column name. | `FromP` | Generate a value by calling 'transformFunc' passing param. Additional functions can be chained after a `From` function to transform the data: | Name | Description |-|- | `Transform` | Apply an arbitrary transform to the data (specified by 'transformFunc'). | `TransformP` | Apply an arbitrary transform to the data, passing a parameter. | `NullIfEqual` | If the input value equals the transform param, return nil. | `NullIfZero` | If the input value equals the zero value of its type, return nil. --- --- title: Distributions sidebar_label: Distributions --- # Distributions Steampipe provides zero-ETL tools for fetching data directly from APIs and services. Steampipe is offered in several distributions: - The **Steampipe CLI** exposes APIs and services as a high-performance relational database, enabling you to write SQL-based queries to explore dynamic data. The Steampipe CLI is a turnkey solution that includes its own PostgreSQL database including plugin management. - **[Steampipe Postgres FDWs](/docs/steampipe_postgres/overview)** are native Postgres Foreign Data Wrappers that translate APIs to foreign tables. Unlike Steampipe CLI, which ships with its own Postgres server instance, the Steampipe Postgres FDWs can be installed in any supported Postgres database version. - **[Steampipe SQLite Extensions](/docs/steampipe_sqlite/overview)** provide SQLite virtual tables that translate your queries into API calls, transparently fetching information from your API or service as you request it. - **[Steampipe Export CLIs](/docs/steampipe_export/overview)** provide a flexible mechanism for exporting information from cloud services and APIs. 
Each exporter is a stand-alone binary that allows you to extract data using Steampipe plugins *without a database*.

---

---
title: FAQ
sidebar_label: FAQ
---

# Steampipe FAQ

## Topics

| Topic |
|-----------------------------------------------------------------|
| [Basics and Functionality](#basics-and-functionality) |
| [Performance and Scalability](#performance-and-scalability) |
| [Plugins and Customization](#plugins-and-customization) |
| [Deployment](#deployment) |
| [Support and Lifecycle](#support-and-lifecycle) |
| [Troubleshooting and Debugging](#troubleshooting-and-debugging) |
| [Supported Linux Distributions](#supported-linux-distributions) |

------

## Basics and Functionality

### What kinds of data sources can Steampipe query?

Steampipe's extensible [plugin](/docs/managing/plugins) model allows it to support a wide range of source data, including:

- Cloud providers like [AWS](https://hub.steampipe.io/plugins/turbot/aws), [Azure](https://hub.steampipe.io/plugins/turbot/azure), [GCP](https://hub.steampipe.io/plugins/turbot/gcp), [Cloudflare](https://hub.steampipe.io/plugins/turbot/cloudflare), [Alibaba Cloud](https://hub.steampipe.io/plugins/turbot/alicloud), [IBM Cloud](https://hub.steampipe.io/plugins/turbot/ibm), and [Oracle Cloud](https://hub.steampipe.io/plugins/turbot/oci).
- Cloud-based services like [GitHub](https://hub.steampipe.io/plugins/turbot/github), [Zoom](https://hub.steampipe.io/plugins/turbot/zoom), [Okta](https://hub.steampipe.io/plugins/turbot/okta), [Slack](https://hub.steampipe.io/plugins/turbot/slack), [Salesforce](https://hub.steampipe.io/plugins/turbot/salesforce), and [ServiceNow](https://hub.steampipe.io/plugins/turbot/servicenow).
- Structured files like [CSV](https://hub.steampipe.io/plugins/turbot/csv), [YML](https://hub.steampipe.io/plugins/turbot/config) and [Terraform](https://hub.steampipe.io/plugins/turbot/terraform).
- Ad hoc investigation of [network services](https://hub.steampipe.io/plugins/turbot/net) like DNS & HTTP.

You can even run [arbitrary commands](https://hub.steampipe.io/plugins/turbot/exec) on local or remote systems and query the output. Find published plugins in the [Steampipe Hub](https://hub.steampipe.io/)!

### Does Steampipe store query results locally?

No. Plugins make API calls, and the results flow into Postgres as ephemeral tables that are only cached for 5 minutes (by default). Steampipe optimizes for live data, and stores nothing by default.

### Can I use `psql`, `pgadmin`, or another client with Steampipe?

Yes. Steampipe exposes a [Postgres endpoint](/docs/query/third-party) that any Postgres client can connect to. When you start the Steampipe [service](/docs/managing/service), Steampipe will print the connection string to the console. You can also run `steampipe service status` to see the connection string.

### Can I export query results as CSV, JSON, etc?

Yes. You can run [steampipe query](/docs/reference/cli/query) with the `--output` argument to capture results in CSV or JSON format:

```bash
steampipe query --output json "select * from aws_account"
```

### Does Steampipe work with WSL (Windows Subsystem for Linux)?

Yes, with WSL 2.0.

### Does Steampipe support SQL write operations?

No. Steampipe is optimized for read-only queries. However, it works closely with [Flowpipe](https://flowpipe.io), which can run Steampipe queries and act on the results.

### How do I know what tables and columns are available to query?
In the Steampipe CLI you can use [.inspect](/docs/reference/dot-commands/inspect) to list tables by plugin name, e.g. `.inspect aws` to produce a selectable list of tables. When you select one, e.g. `.inspect aws_s3_bucket`, you'll see the schema for the table. You can see the same information on the [Steampipe Hub](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_s3_bucket#inspect).

### Can I query more than one AWS account / Azure subscription / GCP project?

Yes. You can create an [aggregator](/docs/managing/connections#using-aggregators). This works for multiple connections of the same type, including AWS/Azure/GCP as well as all other plugins.

It's common practice to use a script to generate the `.spc` files for an organization. See, for example, [https://github.com/happy240/steampipe-conn-generator-for-aws-organization](https://github.com/happy240/steampipe-conn-generator-for-aws-organization).

Turbot Pipes provides [integrations](https://turbot.com/pipes/docs/integrations) to simplify the setup and keep configurations up to date by automatically maintaining connections for [AWS organizations](https://turbot.com/pipes/docs/integrations/aws), [Azure tenants](https://turbot.com/pipes/docs/integrations/azure), [GCP organizations](https://turbot.com/pipes/docs/integrations/gcp) and [GitHub organizations](https://turbot.com/pipes/docs/integrations/github).

## Performance and Scalability

### How well does Steampipe perform when querying multiple connections?

The large variance in customer environments and configurations makes it impossible to provide specific estimates, but many users have scaled Steampipe to hundreds of connections. Multiple connections are queried in parallel, subject to plugin-specific [rate-limiting mechanisms](/docs/guides/limiter). Recent data is served from the [cache](/docs/guides/caching). Connection-level qualifiers, like the AWS `account_id`, can reduce the number of connections queried.

Writing good queries makes a significant difference in performance:

- Select only the columns that you need, to avoid making API calls to hydrate data that you don't require.
- Limit results with a `where` clause on [key columns](/docs/guides/key-columns) when possible to allow Steampipe to do server-side row-level filtering.

### How does Steampipe handle rate-limiting?

Generally, plugins are responsible for handling rate limiting because the details are service-specific. Plugins should typically recognize when they are being rate limited and back off and retry, either using their native Go SDK, the basic rate limiting provided by the [plugin SDK](https://github.com/turbot/steampipe-plugin-sdk), or [limiters](/docs/guides/limiter) compiled into the plugin. You can also define your own custom [limiters](/docs/guides/limiter) in [configuration files](/docs/reference/config-files/overview).

### How can I control the amount of memory used by Steampipe and plugins?

To set a soft memory limit for the Steampipe process, use the `STEAMPIPE_MEMORY_MAX_MB` environment variable. For example, to set a 2GB limit: `export STEAMPIPE_MEMORY_MAX_MB=2048`.

Each plugin runs as its own process, and can have its own memory limit set in its configuration file using the `memory_max_mb` attribute. For example:

```hcl
plugin "aws" {
  memory_max_mb = 2048
}
```

Alternatively, you can set a default memory limit for all plugin processes using the `STEAMPIPE_PLUGIN_MEMORY_MAX_MB` environment variable.
For example, to set a 2GB limit:

```bash
export STEAMPIPE_PLUGIN_MEMORY_MAX_MB=2048
```

## Plugins and Customization

### Can plugin X have a table for Y?

If the plugin lacks a table you need, file a feature request (GitHub issue) for a new table in the applicable plugin repo, e.g. `github.com/turbot/steampipe-plugin-{pluginName}/issues`.

Of course we welcome contributions! The following [guide](/docs/develop/writing-your-first-table) shows you how to write your first table.

### Does Steampipe have a plugin for X?

If you have an idea for a new plugin, file a [feature request](https://github.com/turbot/steampipe/issues/) (GitHub issue) with the label 'plugin suggestions'.

We welcome code contributions as well. If you want to write a plugin, our [guide](/docs/develop/writing_plugins/overview) will help you get started.

### How can I dynamically create Steampipe connections?

All connections are specified in [~/.steampipe/config/*.spc](/docs/reference/config-files/overview) files. Steampipe watches those files and reacts to changes, so if you build those files dynamically you can create connections dynamically.

### Can I create and use regular Postgres tables?

Yes. Each Steampipe plugin defines its own foreign-table schema, but you can create native Postgres tables and views in the public schema.

### Can I use Steampipe plugins with my own database?

Yes. Most plugins support native [Postgres FDWs](/docs/steampipe_postgres/overview) and [SQLite Extensions](/docs/steampipe_sqlite/overview). Find the details for a plugin in its Steampipe Hub documentation, e.g. the GitHub plugin [for Postgres](https://hub.steampipe.io/plugins/turbot/github#postgres-fdw) and [for SQLite](https://hub.steampipe.io/plugins/turbot/github#sqlite-extension).

## Deployment

### Can I run Steampipe in a CI/CD pipeline or cloud shell?

Yes, it's easy to install and use Steampipe in any [CI/CD pipeline or cloud shell](/docs/integrations/overview).

### Where is the Dockerfile or container example?

Steampipe can be run in a containerized setup. We run it ourselves that way as part of [Turbot Pipes](https://turbot.com/pipes). However, we don't publish or support a container definition because:

* The CLI is optimized for developer use on the command line.
* Everyone has specific goals and requirements for their containers.
* Container setup requires various mounts and access to configuration files.
* It's hard to support containers across many different environments.

We welcome users to create and share open-source container definitions for Steampipe.

## Troubleshooting and Debugging

### My query resulted in an error stating that it is `missing 1 required qual`. What does that mean?

The error indicates that you must add a `where =` (or `join...on`) clause for the specified column to your query. The Steampipe database doesn't store data; it makes API calls to get data dynamically. There are times when listing ALL the elements represented by a table is impossible or prohibitively slow. In such cases, a table may [require you to specify an equals qualifier](/docs/sql/tips#some-tables-require-a-where-or-join-clause) in a `where` or `join` clause.

### How can I know what API calls a plugin makes?

Steampipe plugins are open source, and you can inspect the code to see what calls they make.
Some plugins (like the AWS plugin) provide information about the APIs being called using [function tags](/docs/guides/limiter#function-tags) that can be inspected by running Steampipe in [diagnostic mode](/docs/guides/limiter#exploring--troubleshooting-with-diagnostic-mode).

### Can I disable the Steampipe cache?

Yes. [Caching](/docs/guides/caching) significantly improves query performance and reduces API calls to external systems. It is enabled by default, but you can disable it either for the server or for a given client session.

To disable caching at the server level, you can set the `cache` option to `false` in `~/.steampipe/config/default.spc`:

```hcl
options "database" {
  cache = false
}
```

Alternatively, set the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable to `false` before starting Steampipe.

Within an interactive query session, you can disable caching for the client session with the [`.cache off` meta-command](/docs/reference/dot-commands/cache).

### A plugin isn't doing what I expect, how can I debug?

Steampipe writes plugin logs to `~/.steampipe/logs/plugin-YYYY-MM-DD.log`. By default, these logs are written at `warn` level. You can change the log level with the [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) environment variable:

```bash
export STEAMPIPE_LOG_LEVEL=TRACE
```

If Steampipe is running, the plugins must be restarted for the change to take effect: `steampipe service stop --force && steampipe service start`.

## Support and Lifecycle

### Steampipe CLI Support Policy

Steampipe is committed to ensuring accessibility and stability for its users by maintaining versions of the CLI for at least *one year* from their initial release date. This practice ensures that users can access older versions of the CLI if needed, providing a safety net for compatibility issues or preferences. Please note that only the latest CLI version receives ongoing updates and patches.

### Plugin Registry Support Lifecycle

The Steampipe Plugin Registry provides a public repository of installable plugins. To ensure accessibility and stability, the registry will maintain all versions of each plugin for at least *one year* from their release, and will preserve at least *one version* of each plugin.

## Supported Linux Distributions

Steampipe requires glibc version 2.34 or higher. It will not function on systems with an older glibc version. Steampipe is tested on the latest versions of Linux LTS distributions. While it may work on other distributions with the required glibc version, official support and testing are limited to the following:

| Distribution | Version | glibc Version | Notes |
|--------------------|---------|---------------|---------------------------------------------------------|
| Ubuntu LTS | 24.04 | 2.39 | |
| Ubuntu | 22.04 | 2.35 | To cover Windows WSL2, which may be behind |
| CentOS (Stream) | 9 | 2.34 | |
| RHEL | 9 | 2.34 | |
| Amazon Linux | 2023 | 2.34 | |

---

---
title: Best Practices with AWS Organizations
sidebar_label: Integrating AWS Organizations
---

# Using Steampipe CLI with AWS Organizations

There are some considerations when querying hundreds of accounts across all the regions. Steampipe has to have both a [_connection_](https://hub.steampipe.io/plugins/turbot/aws#configuration) defining the specific AWS accounts and an [AWS Credential _profile_](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-settings) that defines how credentials for the connection are obtained.
In a large or dynamic environment, you might have multiple accounts created or closed in any given week. Manually managing the profiles and connections can lead to mistakes and blindspots in your organization, so it's critical that these are kept up to date. This guide also assumes you want to query across all the regions. Why would you do that? A global company is probably going to have a global footprint. Your APAC division probably uses the Singapore and Tokyo regions. Your Italian subsidiary wants to enable eu-south-1. That Oslo acquisition you just made deployed all its infrastructure in eu-north-1. You must assume you have infrastructure in every AWS Region at a certain point. It's why [AWS tells you to enable GuardDuty](https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_settingup.html#setup-before), CloudTrail, and IAM Access Analyzer in all the regions, not just the ones you think you have deployed resources into. ## Three ways to query all your AWS Accounts. This guide will offer three scenarios for accessing all of your AWS accounts using [cross-account roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html). 1. Leverage local credentials to authenticate and assume the cross-account role in a single AWS Organization. 2. Leverage EC2 Instance credentials to authenticate and assume the cross-account role in a single AWS Organization. 3. Leverage EC2 Instance credentials to authenticate and assume the same cross-account role in multiple AWS Organizations. Why cross-account roles? Simply put, they are AWS best-practice for accessing multiple AWS accounts. [AWS recommends](https://docs.aws.amazon.com/accounts/latest/reference/credentials-access-keys-best-practices.html) customers leverage roles over long-term access keys. [AWS Identity Center](https://aws.amazon.com/iam/identity-center/) (formerly known as AWS Single Sign On or SSO) works for a small number of accounts, but as an end-user, you must run `aws sso login` for each account. This guide recommends implementing a [Security view-only access](https://docs.aws.amazon.com/whitepapers/latest/organizing-your-aws-environment/security-ou-and-accounts.html#security-read-only-access-account) AWS Account. All of the other accounts in your AWS Organization(s) should have a security-audit role that trusts the security account. Each example provided below will create a [Steampipe configuration file](https://hub.steampipe.io/plugins/turbot/aws#multi-account-connections) that will query _all_ your accounts by default. Each connection (i.e. account) is prefixed with `aws_`, and all the aws [connections are aggregated](https://steampipe.io/docs/using-steampipe/managing-connections#using-aggregators) via the wildcard `connections = ["aws_*"]` which is placed in the front of the [search path](https://steampipe.io/docs/managing/connections#setting-the-search-path). In each scenario, the Steampipe spc file will look like this: ```hcl # Create an aggregator of _all_ the accounts as the first entry in the search path. connection "aws" { plugin = "aws" type = "aggregator" connections = ["aws_*"] } connection "aws_fooli_sandbox" { plugin = "aws" profile = "fooli-sandbox" regions = ["*"] } connection "aws_fooli_payer" { plugin = "aws" profile = "fooli-payer" regions = ["*"] } ``` The primary difference is how the AWS config file is generated. 
If Steampipe is leveraging local credentials to assume the security-audit role:

```
[profile sp_fooli-sandbox]
role_arn = arn:aws:iam::111111111111:role/security-audit
source_profile = fooli-security
role_session_name = steampipe
```

If Steampipe is leveraging the EC2 Instance Credentials to assume the security-audit role:

```
[profile sp_fooli-sandbox]
role_arn = arn:aws:iam::111111111111:role/security-audit
credential_source = Ec2InstanceMetadata
role_session_name = steampipe
```

-----

## How to run these scripts

### Local Authentication with a cross-account role

In this scenario, you're running from a local workstation and using your existing authentication methods to the trusted security account. This could be an IAM user, temporary credentials provided by [gimme-aws-creds](https://github.com/Nike-Inc/gimme-aws-creds), or AWS SSO. All other connections will leverage a cross-account audit role.

1. You need to dedicate one account in your AWS Organization for the purposes of auditing all the other accounts (the "audit account"). You then need to deploy an IAM Role (the "security-audit role") in all AWS accounts that trusts the audit account.

2. Clone the [steampipe-samples](https://github.com/turbot/steampipe-samples) repo.
```bash
git clone https://github.com/turbot/steampipe-samples.git
cd steampipe-samples/all/aws-organizations-scripts
```

3. Run the `generate_config_for_cross_account_roles.sh` script.
```bash
./generate_config_for_cross_account_roles.sh LOCAL security-audit ~/.aws/fooli-config fooli-security
```
* In the above example `LOCAL` is the method of authentication.
* `security-audit` is the name of the cross-account role you have access to
* `~/.aws/fooli-config` is the location of the AWS config file
* `fooli-security` is the name of an _existing_ AWS profile in the audit account that can assume the `security-audit` role

4. Before adding the contents of the `~/.aws/fooli-config` file to your `~/.aws/config`, you want to make sure there are no duplicate `[profile ]` blocks in either file.

**Note:** this script will not append or overwrite the default `~/.aws/config` file. While we try to prevent conflicts by prefixing all the profiles with `sp_`, you will want to reconcile what is generated with the other profiles in your `~/.aws/config` file or the AWS CLI may fail to run. You can override the default `~/.aws/config` file with the [`AWS_CONFIG_FILE`](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-where) environment variable. If you do that, you will need to make sure to define the `source_profile` in the generated config file.

### EC2 Instance

With an EC2 Instance running Steampipe, we can leverage the [EC2 Instance Metadata Service](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) (IMDS) to generate temporary credentials from the [Instance Profile](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html).

1. You need to dedicate one account in your AWS Organization for the purposes of auditing all the other accounts (the "audit account"). You then need to deploy an IAM Role (the "security-audit role") in all AWS accounts that trusts the audit account.

2. Deploy an EC2 Instance in the audit account, and attach an IAM Instance Profile that has a role with permission to `iam:AssumeRole` the security-audit role.

2. Clone the [steampipe-samples](https://github.com/turbot/steampipe-samples) repo.
```bash
git clone https://github.com/turbot/steampipe-samples.git
cd steampipe-samples/all/aws-organizations-scripts
```

3. Run the `generate_config_for_cross_account_roles.sh` script.
```bash
./generate_config_for_cross_account_roles.sh IMDS security-audit ~/.aws/fooli-config
```
* In the above example `IMDS` is the method of authentication.
* `security-audit` is the name of the cross-account role you have access to
* `~/.aws/fooli-config` is the location of the AWS config file

4. Verify the contents of the `~/.aws/fooli-config` and copy or append it to `~/.aws/config`. Unlike the above example, you do not need to ensure there is a `source_profile` defined. Running the script in IMDS mode can be made idempotent.

### ECS Task

With an ECS task running Steampipe, we can leverage the [ECS task role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html) to obtain the required permissions and assume the cross-account role.

1. You need to dedicate one account in your AWS Organization for the purposes of auditing all the other accounts (the "audit account"). You then need to deploy an IAM Role (the "security-audit role") in all AWS accounts that trusts the audit account.

2. Deploy an ECS task in the audit account, and attach a Task IAM role with permission to `iam:AssumeRole` the security-audit role.

2. Clone the [steampipe-samples](https://github.com/turbot/steampipe-samples) repo.
```bash
git clone https://github.com/turbot/steampipe-samples.git
cd steampipe-samples/all/aws-organizations-scripts
```

3. Run the `generate_config_for_cross_account_roles.sh` script.
```bash
./generate_config_for_cross_account_roles.sh ECS security-audit ~/.aws/fooli-config
```
* In the above example `ECS` is the method of authentication.
* `security-audit` is the name of the cross-account role you have access to
* `~/.aws/fooli-config` is the location of the AWS config file

4. Verify the contents of the `~/.aws/fooli-config` and copy or append it to `~/.aws/config`. Unlike the above example, you do not need to ensure there is a `source_profile` defined. Running the script in ECS mode can be made idempotent.

### Multiple AWS Organizations

At some point, you will find yourself with a second AWS organization. Maybe you created a new organization to test Service Control Policies. Or you've acquired another company and can't migrate accounts until your legal department and AWS's legal department agree to update terms or adjust spending commitments.

How can you leverage the above patterns across multiple AWS Organizations? We can adjust our pattern above slightly. You'll need to ensure all the accounts in each organization have the same cross-account role (the "security-audit role") that trusts the same centralized audit account.

Since we have to do an assume-role to get the account list from each organization, this script is in Python. The usage is:

```bash
usage: generate_config_for_multipayer.py [-h] [--debug]
          [--aws-config-file AWS_CONFIG_FILE]
          [--steampipe-connection-file STEAMPIPE_CONNECTION_FILE]
          --rolename ROLENAME
          --payers PAYERS [PAYERS ...]
          [--role-session-name ROLE_SESSION_NAME]
```

1. You need to dedicate one account in one of your AWS Organizations for the purposes of auditing all the other accounts (the "central audit account"). You then need to deploy an IAM Role (the "security-audit role") in all AWS accounts in every organization that trusts the central audit account.

2.
Deploy an EC2 Instance in the central audit account, and attach an IAM Instance Profile that has a role with permission to `iam:AssumeRole` the security-audit role. 2. Clone the [steampipe-samples](https://github.com/turbot/steampipe-samples) repo. ```bash git clone https://github.com/turbot/steampipe-samples.git cd steampipe-samples/all/aws-organizations-scripts ``` 3. Run the `generate_config_for_multipayer.py` script. ```bash generate_config_for_multipayer.py --aws-config-file ~/.aws/config \ --steampipe-connection-file ~/.steampipe/config/aws.spc \ --rolename security-audit \ --payers 123456789012 210987654321 \ --role-session-name steampipe ``` ## Queries with these scenarios In all three examples, the default connection is the [aggregation of all accounts](https://steampipe.io/docs/managing/connections#using-aggregators). So this SQL query will provide a list of all the instances in every account and region.: ```sql select instance_id, region, account_id, tags ->> 'Name' as name from aws_ec2_instance; ``` You can use the connection name and table to see results for a specific account. ```sql select instance_id, region, account_id, tags ->> 'Name' as name from aws_minecraft.aws_ec2_instance; ``` You can reference the connection `aws_payer` for queries to the AWS Organization service. Here we join all the aws_ec2_instance tables with the organizations_account table that's only in the payer account. ```sql select ec2.instance_id, ec2.region, ec2.account_id, org.name as account_name, ec2.tags ->> 'Name' as instance_name from aws_ec2_instance as ec2, aws_payer.aws_organizations_account as org where org.id = ec2.account_id; ``` --- --- title: Users Guide to Steampipe Caching sidebar_label: Understanding Caching --- # Users Guide to Steampipe Caching Caching is an essential part of the Steampipe experience and is enabled by default. While caching is important in any database, it is especially critical to Steampipe where data is retrieved from external APIs "on-demand". Caching not only significantly improves query performance, it also reduces API calls to external systems which helps avoid throttling and sometimes even reduces costs. Steampipe introduced caching options in one of the earliest releases (v0.2.0). Back then, Steampipe was really just a CLI tool - we didn't really differentiate between server and client. The caching options and behavior were designed when the plugin execution model was different as well; at the time, each Steampipe connection had its own OS process and its own cache and the options reflected that design. In Steampipe v0.20.0, the caching options and behavior have changed. This guide will describe how caching works in Steampipe, as well as the options and settings that you can set to modify caching behavior. ## Types of Caches There are 2 caches in Steampipe: - The **Query Cache** is used to cache query results. Plugins automatically support query caching just by using the Steampipe Plugin SDK. In general this requires no plugin-specific code, though there are cases where the plugin author may need to dictate the caching behavior for a given table. The query cache resides in the plugin process. - The **Plugin Cache** (sometimes called the **Connection Cache**) can be used by plugin authors to cache arbitrary data. The plugin cache also resides in the plugin process. The **Query Cache** is the focus of this guide. 
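For completeness — since the rest of this guide concentrates on the query cache — here is a brief sketch of how a plugin author might use the plugin (connection) cache to memoize an API client. It follows the `d.ConnectionManager.Cache` pattern found in many published plugins, though the exact accessor can differ between SDK versions; the `example` package, `example.NewClient`, and the `EXAMPLE_API_TOKEN` environment variable are hypothetical.

```go
// connect returns an API client for the current connection, storing it in the
// plugin (connection) cache so it is only constructed once per connection.
func connect(ctx context.Context, d *plugin.QueryData) (*example.Client, error) {
	cacheKey := "example-api-client"

	// Reuse a previously cached client for this connection if one exists.
	if cached, ok := d.ConnectionManager.Cache.Get(cacheKey); ok {
		return cached.(*example.Client), nil
	}

	// Otherwise build a new client and cache it for later hydrate calls.
	client, err := example.NewClient(os.Getenv("EXAMPLE_API_TOKEN"))
	if err != nil {
		return nil, err
	}
	d.ConnectionManager.Cache.Set(cacheKey, client)

	return client, nil
}
```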
The Steampipe caching [environment variables](/docs/reference/env-vars/overview) and [configuration file options](/docs/reference/config-files/overview) are used to modify the behavior of the **query cache**, and do not affect the plugin cache. ## How it (basically) works When you issue a query, Steampipe will add the results to the query cache. If you make a subsequent query, it will be served from the cache if: - It selects the same columns or a subset of the columns that were hydrated previously; AND - The qualifiers are the same or more restrictive Some examples: - If you `select * from aws_s3_bucket` and then do `select title,arn from aws_s3_bucket`, the second query will be returned from the cache. - Similarly, if you `select instance_id from aws_ec2_instance` and then do `select instance_id, vpc_id from aws_ec2_instance` the second query will be returned from the cache. This is true in this case because the `vpc_id` column is returned by the same [hydrate function](/docs/develop/writing-plugins#hydrate-functions) as `instance_id` so even though the first query did not specifically request it, Steampipe fetched it from the API and stored it in the cache. - If you `select * from aws_s3_bucket` and then do `select * from aws_s3_bucket where title like '%vandelay%'`, the second query will be returned from the cache. In fact, the caching is actually done by the SDK on a per-table, per-connection basis so in many cases it's clever enough to use the cache even in subsequent queries that join the data. For example: 1. Run `select * from aws_lambda_function`. Steampipe fetches the data from the API and it is added to the cache 1. Run `select * from aws_vpc_subnet`. Steampipe fetches the data from the API and it is added to the cache 1. Run the following query, and it will return the data entirely from the cache: ```sql select fn.name, fn.region, count (availability_zone) as zone_count from aws_lambda_function as fn cross join jsonb_array_elements_text(vpc_subnet_ids) as vpc_subnet join aws_vpc_subnet as sub on sub.subnet_id = vpc_subnet group by fn.name, fn.region order by zone_count; ``` The implementation has a few important implications: - The cache resides in the plugin's process space which implies it is on the server where the database runs, not on the client. This means that the caching is used by any client, not just the `steampipe` CLI. Command-line tools like `psql` and `pgcli` benefit from the query cache, as do BI tools like Metabase and Tableau. - The caching is done per-connection. This means that if you query an aggregator, an equivalent query to the individual connection would be able to use the cached results, and vice-versa. - The cache is shared by ALL connected clients. If multiple users connect to the same Steampipe database, they all share the same cache. ## Query Cache Options Steampipe provides options for enabling/disabling the cache, changing the TTL, and controlling the cache size. These options can be set via config file options, environment variables, or commands in an interactive query shell session. Broadly speaking, there are two groups of settings: 1. [Server-level settings](#server-level-cache-settings) that apply to ALL connections 1. [Client-level settings](#client-level-cache-settings) that apply to a single client session ### Server-level Cache Settings The server settings dictate the actual operation of the cache on the server: - If the server has the `cache` disabled, then caching is off and data is not even written to the cache. 
Any client connecting will NOT be able to use the cache, regardless of their settings.
- The `cache_max_ttl` is the actual maximum cache lifetime - items are invalidated/ejected from the cache after this TTL. A client can request a specific TTL; however, if it exceeds the max TTL on the server, then the effective TTL will be the max TTL.
- The `cache_max_size_mb` is the maximum physical size of the cache. There is no equivalent client setting.

The server-level settings can be set in the [database options](/docs/reference/config-files/options#database-options) or by setting environment variables on the host where the database is running.

```hcl
options "database" {
  cache             = true   # true, false
  cache_max_ttl     = 900    # max expiration (TTL) in seconds
  cache_max_size_mb = 1024   # max total size of cache across all plugins
}
```

| Argument | Default | Values | Description
|-|-|-|-
| `cache` | `true` | `true`, `false` | Enable or disable query caching. This can also be set via the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable.
| `cache_max_size_mb` | unlimited | an integer | The maximum total size of the query cache across all plugins. This can also be set via the [STEAMPIPE_CACHE_MAX_SIZE_MB](/docs/reference/env-vars/steampipe_cache_max_size_mb) environment variable.
| `cache_max_ttl` | `300` | an integer | The maximum length of time to cache query results, in seconds. This can also be set via the [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl) environment variable.

### Client-level Cache Settings

The client settings enable you to choose how your specific client session will use the cache. Because these are client settings, they only apply when connecting with `steampipe`.

Remember that the cache actually lives on the server; the client-level settings allow you to specify how your client session interacts with the cache, but it is subject to the server-level settings:
- If caching is enabled on the server, you can specify that it be disabled for your connection. This is commonly used for testing or troubleshooting.
- If caching is disabled on the server, then the client option to enable is ignored and caching is disabled for *all* clients.
- You can specify the `cache_ttl` for your client session. Note, however, that the client is always subject to the `cache_max_ttl` set on the server. If the `cache_ttl` is greater than the server's `cache_max_ttl`, then the `cache_max_ttl` is the effective TTL.

The client-level settings can be set for each [workspace](/docs/reference/config-files/workspace) or by setting environment variables on the host from which you are connecting.

```hcl
workspace "my_workspace" {
  cache     = true  # true, false
  cache_ttl = 300   # max expiration (TTL) in seconds
}
```

| Argument | Default | Values | Description
|-|-|-|-
| `cache` | `true` | `true`, `false` | Enable/disable caching. Note that this is a **client** setting - if the database (`options "database"`) has the cache disabled, then the cache is disabled regardless of the workspace setting. This can also be set via the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable.
| `cache_ttl` | `300` | an integer | Set the client query cache expiration (TTL) in seconds. Note that this is a **client** setting - if the database `cache_max_ttl` is lower than the `cache_ttl` in the workspace, then the effective TTL for this workspace is the `cache_max_ttl`. This can also be set via the [STEAMPIPE_CACHE_TTL](/docs/reference/env-vars/steampipe_cache_ttl) environment variable.
## Client Cache Commands

When running an interactive `steampipe query` session, you can use the [.cache meta-command](/docs/reference/dot-commands/cache) to enable, disable, or clear the cache for the session. This command affects the caching behavior for this session only - it does not change the server caching options, and changes will not persist after the session ends.

If caching is enabled on the server, you can disable it for your query session:

```sql
.cache off
```

Subsequent queries for this session will neither be added to nor fetched from the cache. You can re-enable it for the session:

```sql
.cache on
```

Note, however, that if the *server* has caching disabled, you cannot enable it.

You can also clear the cache for this session:

```sql
.cache clear
```

Clearing the cache does not actually remove anything from the cache; it just removes items from *your view* of the cache. This is implemented using timestamps on the cache entries. Data added to the cache is timestamped. When you do `.cache clear`, Steampipe changes the minimum timestamp for your session to the current time. When looking for items in the cache, it ignores any item with a timestamp earlier (older) than the minimum for this session.

You can also change the cache TTL for your session with the [.cache_ttl meta-command](/docs/reference/dot-commands/cache_ttl):

```sql
.cache_ttl 60
```

The meta-commands provide a simple interface for modifying the client query cache settings, but they only work in the Steampipe client (`steampipe query`). To allow you to perform equivalent operations from other clients (`psql`, `pgcli`, etc), we have added the `meta_cache` and `meta_cache_ttl` functions to the `steampipe_internal` schema:

Clear the cache:
```sql
select from steampipe_internal.meta_cache('clear')
```

Enable the cache:
```sql
select from steampipe_internal.meta_cache('on')
```

Disable the cache:
```sql
select from steampipe_internal.meta_cache('off')
```

Set the `cache_ttl`:
```sql
select from steampipe_internal.meta_cache_ttl(60)
```

---

---
title: Using Key Column Qualifiers
sidebar_label: Using Key Column Qualifiers
---

# Using Key Column Qualifiers

## What is a Key Column?

Like any relational database table, a Steampipe table is composed of one or more columns, each with a name and data type. When running Steampipe in the context of a database, you can join, filter, sort, and aggregate on any column.

But unlike a conventional database table, Steampipe does not simply read data that is stored on disk. Instead, it fetches data from external sources such as APIs and cloud services. Steampipe is able to parallelize the requests to a large degree, but these requests take time and resources; every request consumes CPU, memory and network resources on both the client and the server. Steampipe hides the details from you, but even a simple query may result in hundreds of API calls.

**Key Columns** enable you to optimize the data retrieval by using the capabilities of the underlying API to do row-level filtering of the results when Steampipe fetches them. Essentially, if you filter on key columns in your `where` and `join` clauses, Steampipe can do **server-side filtering**. This improves efficiency, reduces query time, and helps avoid API throttling.

## Discovering Key Columns

Key columns are table-specific; they work with the capabilities of the underlying API.
It's up to the plugin author to define and implement them in the plugin source code. As a user of the plugin, how do you know which columns are key columns? And how do you know which operators are supported? The easiest way is to look in the table documentation on the [Steampipe Hub](https://hub.steampipe.io/plugins). Every table will have a page in the Hub that includes a table of `schema` information. The `Operators` column indicates which key column operators are supported for the column.
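As a concrete (and purely illustrative) example, assume the AWS plugin, where `instance_id` is a key column on the `aws_ec2_instance` table. Filtering on that column lets the plugin ask the API for just the matching instance, instead of listing every instance in the account and filtering locally:

```sql
-- Filtering on a key column is pushed down to the API (server-side filtering).
-- Illustrative only; assumes the AWS plugin's aws_ec2_instance table.
select instance_id, instance_type, instance_state
from aws_ec2_instance
where instance_id = 'i-0123456789abcdef0';
```

Without the qualifier on the key column, the same query would enumerate every instance and filter the rows on the client side.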
If you are running the Steampipe CLI, you can also get the key column information from the [`steampipe_plugin_column` table](#introspecting-key-columns). ### Required Key Columns There are times when listing ALL the elements represented by a table is impossible or prohibitively slow. In such cases, a table may *require* you to specify a qualifier on a key column. For example, the Github `ListUsers` API will enumerate ALL Github users. It is not reasonable to page through hundreds of thousands of users to find what you are looking for. Instead, Steampipe requires that you specify `where login =` to find the user directly, for example: ```sql select * from github_user where login = 'torvalds'; ``` Alternatively, you can join on the key column (`login`) in a `where` or `join` clause: ```sql select u.login, o.login as organization, u.name, u.company, u.location from github_user as u, github_my_organization as o, jsonb_array_elements_text(o.member_logins) as member_login where u.login = member_login; ``` or ```sql select u.login, o.login as organization, u.name, u.company, u.location from github_my_organization as o, jsonb_array_elements_text(o.member_logins) as member_login join github_user as u on u.login = member_login; ``` The [Hub documentation](https://hub.steampipe.io/plugins) will include information about which key columns are required. If you don't pass a required qualifier, Steampipe will let you know: ```sql > select * from github_user Error: rpc error: code = Internal desc = 'List' call for table 'github_user' is missing 1 required qual: column:'login' operator: = (SQLSTATE HV000) ``` ### Supported Operators **Not all key columns support all operators**, and the [Hub documentation](https://hub.steampipe.io/plugins) will tell you which are supported for a given column. When using the Steampipe CLI, [Postgres FDWs](/docs/steampipe_postgres/overview), or [SQLite Extensions](/docs/steampipe_sqlite/overview), you can use operators in your SQL query that are not supported by the key column, but the data will be filtered on the client side after all the data has been retrieved (like any other non-key column). When using the [Export CLIs](/docs/steampipe_export/overview), however, you may only use the operators that are supported for the key column. #### Key Column Operators | Operator | Description | Abbreviation |-----------------|----------------------|------- | `=` | Equals | `=` | `<>`, `!=` | Not equal to | `ne` | `<` | Less than | `lt` | `<=` | Less than or equal to| `le` | `>` | Greater than | `gt` | `>=` | Greater than or equal to | `ge` | `~~` | Like | `~~` | `!~~` | Not Like | `!~~` | `~~*` | ILike | `~~*` | `!~~*` | Not ILike | `!~~*` | `~` | Matches regex | `~` | `!~` | Does not match regex | `!~` | `~*` | Matches iregex | `~*` | `!~*` | Does not match iregex| `!~*` | `is null` | is null | `is null` | `is not null` | is not null | `is not null` ## How it (basically) works When you run a database query, the database engine parses the query, generates one or more query plans, and then selects the plan that it believes is optimal. It then translates the query into function calls to fetch the data. After the data is fetched, the database engine may do additional filtering, formatting, and aggregation. Key columns are used in both the planning and the execution phases. In the planning phase, Steampipe assigns a lower cost to plans that filter on key columns. This serves to influence the planner to choose query plans that will leverage the key columns. 
In the execution phase, the database will call the appropriate [List, Get, and Hydrate functions](/docs/develop/writing-plugins#hydrate-functions) in the plugin. The plugin will then make API calls using the key columns to fetch data only for the rows it needs. After the data is fetched, the database engine will do additional filtering for the qualifiers that are not key columns, as well as any sorting, formatting, or aggregation that is required. ## Introspecting Key Columns If you are running the Steampipe CLI, you can get the key column information from the `steampipe_plugin_column` table: ```sql select name, type, (coalesce(get_config, '{}') || coalesce(list_config, '{}')) -> 'operators' as operators, coalesce((get_config || list_config) ->> 'require', 'optional') as required from steampipe_plugin_column where (coalesce(get_config, '{}') || coalesce(list_config, '{}')) -> 'operators' is not null and table_name = 'aws_vpc'; ``` ```sql +-----------------+--------+------------+----------+ | name            | type   | operators  | required | +-----------------+--------+------------+----------+ | vpc_id          | STRING | ["="]      | optional | | cidr_block      | CIDR   | ["="]      | optional | | state           | STRING | ["="]      | optional | | is_default      | BOOL   | ["=","!="] | optional | | dhcp_options_id | STRING | ["="]      | optional | | owner_id        | STRING | ["="]      | optional | +-----------------+--------+------------+----------+ ``` --- --- title: User's Guide to Concurrency & Rate Limiting sidebar_label: Concurrency & Rate Limiting --- # Concurrency & Rate Limiting Steampipe is designed to be fast. It provides parallel execution at multiple layers: - It runs controls in parallel. - It runs queries in parallel. - For a given query it runs [List, Get, and Hydrate functions](/docs/develop/writing-plugins#hydrate-functions) in parallel. This high degree of concurrency results in low latency and high throughput, but may at times overwhelm the underlying service or API. Features like exponential back-off & retry and [caching](/docs/guides/caching) markedly improve the situation, but at large scale you may still run out of local or remote resources. The Steampipe `limiter` was created to help solve these types of problems. Limiters provide a simple, flexible interface to implement client-side rate limiting and concurrency thresholds at compile time or run time. You can use limiters to: - Smooth the request rate from Steampipe to reduce load on the remote API or service - Limit the number of parallel requests to reduce contention for client and network resources - Avoid hitting server limits and throttling ## Defining Limiters Limiters may be defined in Go code and compiled into a plugin, or they may be defined in HCL in `.spc` configuration files. In either case, the possible settings are the same. Each limiter must have a name. In the case of an HCL definition, the label on the `limiter` block is used as the rate limiter name. For a limiter defined in Go, you must include a `Name`. A limiter may specify a `max_concurrency`, which sets a ceiling on the number of [List, Get, and Hydrate functions](/docs/develop/writing-plugins#hydrate-functions) that can run in parallel. ```hcl # run up to 250 hydrate/list/get functions concurrently plugin "aws" { limiter "aws_global_concurrency" { max_concurrency = 250 } } ``` A limiter may also specify a `bucket_size` and `fill_rate` to limit the rate at which List, Get, and Hydrate functions may run. 
The rate limiter uses a token-bucket algorithm, where the `bucket_size` specifies the maximum number of tokens that may accrue (the burst size) and the `fill_rate` specifies how many tokens are refilled each second. ```hcl plugin "aws" { # run up to 1000 hydrate/list/get functions per second limiter "aws_global_rate_limit" { bucket_size = 1000 fill_rate = 1000 } } ``` Every limiter has a **scope**. The scope defines the context for the limit: which resources are subject to / counted against the limit. There are built-in scopes for `connection`, `table`, `function_name`, and any matrix qualifiers that the plugin may include. A plugin author may also add [function tags](#function-tags) that can be used as scopes. If no scope is specified, then the limiter applies to all functions in the plugin. For instance, this limiter will allow 1000 hydrate/list/get functions per second *across all connections*: ```hcl plugin "aws" { # run up to 1000 hydrate/list/get functions per second across all aws connections limiter "aws_global_rate_limit" { bucket_size = 1000 fill_rate = 1000 } } ``` If you specify a list of scopes, then *a limiter instance is created for each unique combination of scope values*. It acts much like `group by` in a SQL statement. For example, to limit to 1000 hydrate/list/get functions per second in *each region of each connection*: ```hcl plugin "aws" { # run up to 1000 hydrate/list/get functions per second in each region of each connection limiter "aws_regional_rate_limit" { bucket_size = 1000 fill_rate = 1000 scope = ["connection", "region"] } } ``` You can use a `where` clause to further filter the scopes to specific values. For example, we can restrict the limiter so that it only applies to a specific region: ```hcl plugin "aws" { # run up to 1000 hydrate/list/get functions per second in us-east-1 for each connection limiter "aws_rate_limit_us_east_1" { bucket_size = 1000 fill_rate = 1000 scope = ["connection", "region"] where = "region = 'us-east-1'" } } ``` You can define multiple limiters. If a function is included in the scope of multiple rate limiters, they will all apply. The function will wait until every rate limiter that applies to it has available bucket tokens and is below its max concurrency. ```hcl plugin "aws" { # run up to 250 functions concurrently across all connections limiter "aws_global_concurrency" { max_concurrency = 250 } # run up to 1000 functions per second in us-east-1 for each connection limiter "aws_rate_limit_us_east_1" { bucket_size = 1000 fill_rate = 1000 scope = ["connection", "region"] where = "region = 'us-east-1'" } # run up to 200 functions per second in regions OTHER than us-east-1 # for each connection limiter "aws_rate_limit_non_us_east_1" { bucket_size = 200 fill_rate = 200 scope = ["connection", "region"] where = "region <> 'us-east-1'" } } ``` ## Function Tags Hydrate function tags provide useful diagnostic metadata, and they can also be used as scopes in rate limiters. Rate limiting requirements vary by plugin because the underlying APIs that they access implement rate limiting differently. Tags provide a way for a plugin author to scope rate limiters in a way that aligns with the API implementation. Function tags must be [added in the plugin code by the plugin author](/docs/develop/writing-plugins#function-tags). Once the tags are added to the plugin, you can use them in the `scope` and `where` arguments for your rate limiter. 
```hcl plugin "aws" { limiter "sns_get_topic_attributes_us_east_1" { bucket_size = 3000 fill_rate = 3000 scope = ["connection", "region", "service", "action"] where = "action = 'GetTopicAttributes' and service = 'sns' and region = 'us-east-1' " } } ``` You can view the available tags in the `scope_values` when in [diagnostic mode](#exploring--troubleshooting-with-diagnostic-mode). For example, to see the tags in the `aws_sns_topic` table: ```sql with one_row as materialized ( select * from aws_sns_topic limit 1 ) select c ->> 'function_name' as function_name, jsonb_pretty(c -> 'scope_values') as scope_values from one_row, jsonb_array_elements(_ctx -> 'diagnostics' -> 'calls') as c ``` ```sql +-------------------------------+--------------------------------------------+ | function_name | scope_values | +-------------------------------+--------------------------------------------+ | listAwsSnsTopics | { | | | "table": "aws_sns_topic", | | | "action": "ListTopics", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "listAwsSnsTopics" | | | } | | listTagsForSnsTopic | { | | | "table": "aws_sns_topic", | | | "action": "ListTagsForResource", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "listTagsForSnsTopic" | | | } | | listRegionsForServiceUncached | { | | | "table": "aws_sns_topic", | | | "region": "us-east-1", | | | "connection": "aws_dmi" | | | } | | getTopicAttributes | { | | | "table": "aws_sns_topic", | | | "action": "GetTopicAttributes", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "" | | | } | +-------------------------------+--------------------------------------------+ ``` ## Exploring & Troubleshooting with Diagnostic Mode To assist in troubleshooting your rate limiter setup, Steampipe has introduced Diagnostic Mode. To enable Diagnostic Mode, set the `STEAMPIPE_DIAGNOSTIC_LEVEL` environment variable to `ALL` when you start the Steampipe DB: ```bash STEAMPIPE_DIAGNOSTIC_LEVEL=all steampipe service start ``` With diagnostics enabled, the `_ctx` column will contain information about what functions were called to fetch the row, the scope values (including any [tags](#defining-tags)) for the function, the limiters that were in effect, and the amount of time the request was delayed by the `limiters`. This diagnostic information can help you discover what scopes are available to use in limiters as well as to see the effect and impact of limiters that you have defined. 
```sql select jsonb_pretty(_ctx) as _ctx ,display_name from aws_sns_topic limit 2 ``` ```sql +-----------------------------------------------------------+--------------+ | _ctx | display_name | +-----------------------------------------------------------+--------------+ | { | | | "diagnostics": { | | | "calls": [ | | | { | | | "type": "list", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "ListTopics", | | | "region": "us-east-2", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "listAwsSnsTopics" | | | }, | | | "function_name": "listAwsSnsTopics", | | | "rate_limiters": [ | | | "aws_global", | | | "sns_list_topics" | | | ], | | | "rate_limiter_delay_ms": 0 | | | }, | | | { | | | "type": "hydrate", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "GetTopicAttributes", | | | "region": "us-east-2", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "" | | | }, | | | "function_name": "getTopicAttributes", | | | "rate_limiters": [ | | | "sns_get_topic_attributes_150", | | | "aws_global" | | | ], | | | "rate_limiter_delay_ms": 808 | | | } | | | ] | | | }, | | | "connection_name": "aws_dmi" | | | } | | | { | | | "diagnostics": { | | | "calls": [ | | | { | | | "type": "list", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "ListTopics", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "listAwsSnsTopics" | | | }, | | | "function_name": "listAwsSnsTopics", | | | "rate_limiters": [ | | | "aws_global", | | | "sns_list_topics" | | | ], | | | "rate_limiter_delay_ms": 597 | | | }, | | | { | | | "type": "hydrate", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "GetTopicAttributes", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dmi", | | | "function_name": "" | | | }, | | | "function_name": "getTopicAttributes", | | | "rate_limiters": [ | | | "sns_get_topic_attributes_us_east_1", | | | "aws_global" | | | ], | | | "rate_limiter_delay_ms": 0 | | | } | | | ] | | | }, | | | "connection_name": "aws_dmi" | | | } | | +-----------------------------------------------------------+--------------+ ``` The diagnostics information includes information about each Get, List, and Hydrate function that was called to fetch the row, including: | Key | Description |-------------------------|---------------------- | `type` | The type of function (`list`, `get`, or `hydrate`). | `function_name` | The name of the function. | `scope_values` | A map of scope names to values. This includes the built-in scopes as well as any matrix qualifier scopes and function tags. | `rate_limiters` | A list of the rate limiters that are scoped to the function. | `rate_limiter_delay_ms` | The amount of time (in milliseconds) that Steampipe waited before calling this function due to client-side (`limiter`) rate limiting. ## Viewing and Overriding Limiters Steampipe includes the `steampipe_plugin_limiter` table to provide visibility into all the limiters that are defined in your installation, including those defined in plugin code as well as limiters defined in HCL. 
```sql select name,plugin,source_type,status,bucket_size,fill_rate,max_concurrency from steampipe_plugin_limiter ``` ```sql +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ | name | plugin | source_type | status | bucket_size | fill_rate | max_concurrency | +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ | exec_max_concurrency_limiter | hub.steampipe.io/plugins/turbot/exec@latest | plugin | active | | | 15 | | sns_get_topic_attributes_150 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 150 | 150 | | | sns_get_topic_attributes_30 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 30 | 30 | | | aws_global | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 10 | 10 | | | sns_list_topics | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 30 | 30 | | | sns_list_tags_for_resource | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 10 | 10 | | | sns_get_topic_attributes_us_east_1 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 3000 | 3000 | | | sns_get_topic_attributes_900 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 900 | 900 | | +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ ``` You can override a limiter that is compiled into a plugin by creating an HCL limiter with the same name. In the previous example, we can see that the `exec` plugin includes a default limiter named `exec_max_concurrency_limiter` that sets the max_concurrency to 15. We can override this value at run time by creating an HCL `limiter` for this plugin with the same name. The `limiter` block must be contained in a `plugin` block. Like `connection`, Steampipe will load all `plugin` blocks that it finds in any `.spc` file in the `~/.steampipe/config` directory. For example, we can add the following snippet to the `~/.steampipe/config/exec.spc` file: ```hcl plugin "exec" { limiter "exec_max_concurrency_limiter" { max_concurrency = 20 } } ``` Querying the `steampipe_plugin_limiter` table again, we can see that there are now 2 rate limiters for the `exec` plugin named `exec_max_concurrency_limiter`, but the one from the plugin is overridden by the one in the config file. 
```sql select name,plugin,source_type,status,bucket_size,fill_rate,max_concurrency from steampipe_plugin_limiter ``` ```sql +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ | name | plugin | source_type | status | bucket_size | fill_rate | max_concurrency | +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ | exec_max_concurrency_limiter | hub.steampipe.io/plugins/turbot/exec@latest | plugin | active | | | 15 | | exec_max_concurrency_limiter | hub.steampipe.io/plugins/turbot/exec@latest | config | active | | | 20 | | aws_global | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 10 | 10 | | | sns_list_topics | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 30 | 30 | | | sns_list_tags_for_resource | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 10 | 10 | | | sns_get_topic_attributes_us_east_1 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 3000 | 3000 | | | sns_get_topic_attributes_900 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 900 | 900 | | | sns_get_topic_attributes_150 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 150 | 150 | | | sns_get_topic_attributes_30 | hub.steampipe.io/plugins/turbot/aws@latest | config | active | 30 | 30 | | +------------------------------------+---------------------------------------------+-------------+--------+-------------+-----------+-----------------+ ``` ## Hints, Tips, & Best practices - You can use ANY scope in the `where`, even if it does not appear in the `scope` for the limiter. Remember that the `scope` defines the grouping; it acts similar to `group by` in SQL. Consider the following rate limiter: ```hcl plugin "aws" { limiter "aws_sns_read_rate_limit" { bucket_size = 2500 fill_rate = 2500 scope = ["connection", "region", "service", "action"] where = "service = 'sns' and (action like 'Get%' or action like 'List%') " } } ``` This will create a separate rate limiter instance for every action in the `sns` service in every region of every account - You can do 2500 `GetTopicAttributes` requests/sec in each account/region, and also 2500 `ListTagsForResource` requests/sec in each account/region, and also 2500 `ListTopics` requests/sec in each account/region. If we remove `action` from the `scope`, there will be *one* rate limiter instance for *all* actions in the `sns` service in each region/account - You can do 2500 total `GetTopicAttributes` or `ListTagsForResource` or `ListTopics` requests per second in each account/region. ```hcl plugin "aws" { limiter "aws_sns_read_rate_limit" { bucket_size = 2500 fill_rate = 2500 scope = ["connection", "region", "service"] where = "service = 'sns' and (action like 'Get%' or action like 'List%') " } } ``` - Setting `max_concurrency` at the plugin level can help prevent running out of local resources like network bandwidth, ports, file handles, etc. ```hcl plugin "aws" { max_concurrency = 250 } ``` - Optimizing rate limiters requires knowledge of how the API is implemented. If the API publishes information about what the rate limits are, and how they are applied, that's a good starting point for setting your `bucket_size` and `fill_rate` values. Getting the `limiter` values right usually involves some trial and error though, and simply setting `max_concurrency` is often good enough to get past a problem. 
- Use the plugin logs (`~/.steampipe/logs/plugin*.log`) to verify that the rate limiters are reducing the throttling and other errors from the API as you would expect. - Use the `steampipe_plugin_limiter` table to see what rate limiters are in effect from both the plugins and the config files, as well as which are active. Use `STEAMPIPE_DIAGNOSTIC_LEVEL=ALL` to enable extra diagnostic info in the `_ctx` column to discover what scopes are available and to verify that limiters are being applied as you expect. Note that the `STEAMPIPE_DIAGNOSTIC_LEVEL` variable must be set in the database service process; if you run Steampipe as a service, it must be set when you run `steampipe service start`. - Throttling errors from the server, such as `429 Too Many Requests`, are not *inherently* bad. Most cloud SDKs expect such errors to occur occasionally and account for retrying them. Steampipe plugins generally implement exponential back-off and retry to account for such cases. You can use client-side limiters to help avoid resource contention and to reduce throttling from the server, but completely avoiding server-side throttling is probably not necessary in most cases. --- --- title: Guides sidebar_label: Guides --- # Guides Guides provide expanded explanations for common cases. - **[Use the search_path to target specific connections (or aggregators) →](/docs/guides/search-path)** - **[Use the Steampipe CLI with AWS Organizations →](/docs/guides/aws-orgs)** - **[Understand and Configure the Query Cache →](/docs/guides/caching)** - **[Implement client-side rate limiting with `limiter` →](/docs/guides/limiter)** --- --- title: Using search_path to target connections and aggregators sidebar_label: Using search_path --- # Using search_path to target connections and aggregators You are probably here for one of the following reasons: - You can't figure out why Steampipe isn't using your [aggregator](https://steampipe.io/docs/managing/connections#querying-multiple-connections) - You want to run `steampipe query` or [Powerpipe](https://powerpipe.io/) commands against a specific connection - You want to change your default connection - You've seen references to the search path elsewhere, but you're not sure why it's important - You asked what you thought was a simple question on the Steampipe Slack, and instead of an answer they sent you this link (ugh...homework...) This guide will attempt to answer these questions in 5 minutes or less. ## Schemas in Postgres Steampipe leverages PostgreSQL foreign data wrappers to provide a SQL interface to external services and systems. The Steampipe database is an embedded PostgreSQL database. A PostgreSQL database contains one or more [schemas](https://www.postgresql.org/docs/current/ddl-schemas.html). A schema is a namespaced collection of named objects, like tables, functions, and views. Steampipe creates a Postgres schema for each Steampipe connection. In fact, if you query the Postgres information schema, you can get a list of the schemas in the database: ```sql select schema_name from information_schema.schemata order by schema_name; ``` Note that the schema names match your Steampipe connection names: ```sql .inspect ``` The schemas, in turn, contain the foreign tables that you write queries against. 
Again, you can see this in the information schema: ```sql select foreign_table_schema, foreign_table_name from information_schema.foreign_tables where foreign_table_schema = 'aws' ``` Or more simply, using the Steampipe `.inspect` command: ```sql .inspect aws ``` In Steampipe, a [plugin](https://steampipe.io/docs/managing/plugins) defines and implements a set of related foreign tables. All connections for a given plugin will contain the same set of tables. Within a schema, table names must be unique; however, the same table name can be used in different schemas. You can reference tables using a **qualified name** to disambiguate. A qualified name consists of the schema name and the object name, separated by a period. For example, to query the `aws_account` table in the `aws_prod` schema (which corresponds to the `aws_prod` connection), you can refer to it as `aws_prod.aws_account`: ```sql select * from aws_prod.aws_account ``` ## Unqualified Success Postgres also allows you to use **unqualified names**: ```sql select * from aws_account ``` Note that the `aws_account` table is specified, but the schema is not. If you have the same table name in multiple schemas, how does Postgres determine which table to use? As you probably guessed, this is where the [schema search path](https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATH) comes in. The search path allows you to specify a list of schemas to be searched for the object. The first schema in the list that contains an object that matches the name will be used. For example, assume that the search path is set to `gcp_prod, azure_prod, aws_prod, aws_test`, and you run `select * from aws_account`. 1. Postgres will look in the `gcp_prod` schema for a table named `aws_account`, but it does not exist, so it continues to the next schema in the list. 2. Postgres will look in the `azure_prod` schema for a table named `aws_account`, but it does not exist, so it continues to the next schema in the list. 3. Postgres will look in the `aws_prod` schema for a table named `aws_account`. It finds the `aws_account` table, so it runs the query against the `aws_prod.aws_account` table. Queries in [Powerpipe Mods](https://powerpipe.io/docs/build) for Steampipe are written using **unqualified names**. This allows you to run the exact same queries, dashboards, and benchmarks against any connection, just by changing the search path! ## Setting the Search Path By default, Steampipe sets the schema search path as follows: 1. The `public` schema first. This schema is writable, and allows you to create your own objects (views, tables, functions, etc.). 2. Connection schemas, in **alphabetical order** by default. 3. The `internal` schema last. This schema contains Steampipe built-in functions and other internal Steampipe objects. This schema is not displayed or managed by the Steampipe search path commands and options, but you'll see it in native SQL commands such as `show search_path`. Since the connection schemas are added to the search_path alphabetically by default, the simplest way to set the default is to rename the connections. For example, let's assume that I have 3 AWS accounts and an [aggregator](https://steampipe.io/docs/managing/connections#querying-multiple-connections), and I want the aggregator to be the first in the search path. 
I could name them as follows: - `aws_prod` - Production AWS account - `aws_qa` - QA AWS account - `aws_dev` - Development AWS account - `aws` - an aggregator of all 3 of the above AWS connections Steampipe will add the aggregator before the other AWS connections because `aws` is first alphabetically: ``` > .search_path +-------------------------------------+ | search_path | +-------------------------------------+ | public,aws,aws_dev,aws_prod,aws_qa | +-------------------------------------+ ``` If you prefer, you can explicitly set the `search_path` in the [database options](/docs/reference/config-files/options#database-options) in your `~/.steampipe/config/default.spc` file. Note that this is somewhat brittle because every time you install or uninstall a plugin, or add or remove a connection, you will need to update the file with the new `search_path`. ## Search Path Prefix Setting the `search_path` will replace the current search path. Usually, however, you will not want to replace the entire search path, but rather *prefer* a given connection. To simplify this case, set the `search_path_prefix`. Setting the prefix will *move* the prefix to the front of the search path. You can change the search path in your interactive terminal session with the [search_path](/docs/reference/dot-commands/search_path) or [search_path_prefix](/docs/reference/dot-commands/search_path_prefix) meta-commands. This will change the search path only for the current session. You can also pass a search path or prefix to the `steampipe query` command, as well as to [Powerpipe](https://powerpipe.io/) commands (`powerpipe server`, `powerpipe benchmark run`, `powerpipe dashboard run`, `powerpipe control run`, `powerpipe query run`) to change the search path for that command. For instance, to run the CIS Benchmark against the `aws_prod` connection, you can run: ```bash powerpipe benchmark run benchmark.cis_v140 --search-path-prefix aws_prod ``` ## Tips & Tricks - Manage your default search path with a good connection-naming strategy. For most users, this means aggregator first. With AWS, for example, use the plugin name as the name of the aggregator (e.g. `aws`), and as a prefix to the other connections (e.g. `aws_prod`, `aws_dev`, etc). With this approach the aggregator always comes first, even when adding and removing connections. - Use the search path **prefix** command or argument to modify the search path when you want to prefer a connection. - When writing mods, use **unqualified** table names: - Qualified names would require you to know the connection names, which you don't know (they are defined by the user). - Users of your mod can vary the search path to target different connections. - If you create custom views or other objects, make sure you keep the `public` schema in your path. - Since the `public` schema is first (by default), you can create your own tables and views to use instead of the Steampipe tables. If, for example, there is a table that you want to 'permanently' cache (or only manually refresh), you can create a materialized view with the same name: `create materialized view aws_iam_credential_report as select * from aws_iam_credential_report` (see the expanded example below). 
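Here is a minimal sketch of that approach. It assumes a connection schema named `aws` that provides the `aws_iam_credential_report` table, and that `public` remains first in your search path so the materialized view shadows the foreign table:

```sql
-- Create a locally stored copy in the public schema; because public comes
-- first in the search path, unqualified queries will now hit this view.
create materialized view public.aws_iam_credential_report as
select * from aws.aws_iam_credential_report;

-- Re-run the underlying Steampipe query on demand to pick up fresh data.
refresh materialized view public.aws_iam_credential_report;
```

To go back to querying the live table, drop the materialized view (`drop materialized view public.aws_iam_credential_report`).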
## More Information - [Setting the Search Path](https://steampipe.io/docs/managing/connections#setting-the-search-path) - [.search_path_prefix meta-command](https://steampipe.io/docs/reference/dot-commands/search_path_prefix) - [.search_path meta-command](https://steampipe.io/docs/reference/dot-commands/search_path) - [PostgreSQL Schema Search Path documentation](https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATH) - [database options](https://steampipe.io/docs/reference/config-files/database) - [cli reference - steampipe query](https://steampipe.io/docs/reference/cli/query) - [cli reference - Powerpipe](https://powerpipe.io/docs/reference/cli) --- --- title: Using Steampipe in AWS Cloud Shell sidebar_label: AWS Cloud Shell --- # Using Steampipe in AWS Cloud Shell [AWS CloudShell](https://aws.amazon.com/cloudshell/) is a free service that spins up a terminal right in your AWS account. Because the terminal includes the AWS CLI and your credentials, it takes just a few seconds to install Steampipe itself, along with the [AWS plugin](https://hub.steampipe.io/plugins/turbot/aws). You can then immediately write SQL queries to pull data from the hundreds of Postgres tables supported by the plugin. ## About AWS Cloud Shell To start the shell, visit a URL like https://us-east-1.console.aws.amazon.com/cloudshell/home and click the CloudShell icon. If you don't see the icon, switch to a [supported region](https://docs.aws.amazon.com/cloudshell/latest/userguide/supported-aws-regions.html).
Cloud Shell includes 1 GB of free persistent storage per region. When you exit the shell, AWS preserves only the files inside your home directory. So we'll install Steampipe in your home directory (vs `/usr/local/bin`), and we'll run Steampipe as `./steampipe` (vs `steampipe`). ## Installing Steampipe in AWS Cloud Shell To install Steampipe, copy and run this command. ```bash curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzvf - ``` To install the AWS plugin, copy and run this command. ``` ./steampipe plugin install aws ``` Your output should look like: ``` aws [====================================================================] Done Installed plugin: aws@latest v0.77.0 Documentation: https://hub.steampipe.io/plugins/turbot/aws ``` ## Run your first query To launch Steampipe in query mode, do this: ```bash ./steampipe query ``` Steampipe prints a welcome message and a prompt. ``` Welcome to Steampipe v0.16.3 For more information, type .help > ``` To find all your S3 buckets, enter this query: ``` select * from aws_s3_bucket ``` Your output should look like: ``` +-------------------------------------------+--------------------------------------------------------+----------------------+-------------------------+ | name | arn | creation_date | bucket_policy_is_public | +-------------------------------------------+--------------------------------------------------------+----------------------+-------------------------+ | aws-cloudtrail-logs-605491513981-45df8af0 | arn:aws:s3:::aws-cloudtrail-logs-605491513981-45df8af0 | 2022-05-04T16:37:09Z | false | | jon-turbot-test-bucket-01 | arn:aws:s3:::jon-turbot-test-bucket-01 | 2021-10-04T16:55:29Z | false | | cf-templates-1s5tzrjxv4j52-us-west-1 | arn:aws:s3:::cf-templates-1s5tzrjxv4j52-us-west-1 | 2021-12-28T00:37:38Z | false | +-------------------------------------------+--------------------------------------------------------+----------------------+-------------------------+ ``` That's it! You didn't have to read AWS API docs, or install an API client library like `boto3`, or learn how to use that client to make API calls and unpack JSON responses. Steampipe did all that for you. It works the same way for every AWS table. And because you can use SQL to join across AWS tables, it's easy to reason over your entire AWS infrastructure. To see the full set of columns for any table, along with examples of their use, visit the [Steampipe Hub](https://hub.steampipe.io/plugins/turbot/aws/tables/). For S3 buckets, visit [aws_s3_bucket](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_s3_bucket). For quick reference you can autocomplete table names directly in the shell.
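As a quick illustration of joining across AWS tables (a sketch that assumes you have EC2 instances; it uses the plugin's `aws_ec2_instance` and `aws_vpc` tables), the following query matches each instance to the CIDR block of the VPC it runs in:

```sql
-- Join two AWS tables: list each EC2 instance alongside its VPC's CIDR block.
select
  i.instance_id,
  i.instance_type,
  v.vpc_id,
  v.cidr_block
from
  aws_ec2_instance as i
  join aws_vpc as v on i.vpc_id = v.vpc_id;
```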
If you haven't used SQL lately, see our [handy guide](https://steampipe.io/docs/sql/steampipe-sql) for writing Steampipe queries. --- --- title: Using Steampipe in Azure Cloud Shell sidebar_label: Azure Cloud Shell --- # Using Steampipe in Azure Cloud Shell [Azure Cloud Shell](https://shell.azure.com/) is a browser-based shell preloaded with tools to create and manage your Azure resources. Because the cloud shell includes the CLI and launches with your credentials, you can quickly install Steampipe along with the [Azure plugin](https://hub.steampipe.io/plugins/turbot/azure) and then instantly query your Azure resources. ## About the Azure Cloud Shell The Cloud Shell is free to all Azure users. It comes with a few [limitations](https://learn.microsoft.com/en-us/azure/cloud-shell/limitations). For example, it will use an existing resource group but must be able to create storage accounts and file shares. You may incur a cost for the file share that persists your data. Also, since you are not a user with permission to `sudo` and cannot modify files or directories outside your home directory, we will install Steampipe there and refer to it as `./steampipe`. Finally, be aware that Azure will shut down your session if inactive for 20 minutes. To start the shell, look for its icon on the top navigation bar of the Azure portal.
When you launch the shell for the first time, you will see this dialog box.
Click `Create storage` to continue. ## Installing Steampipe in Azure Cloud Shell To install Steampipe, copy and run this command. ```bash curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzvf - ``` To install the Azure plugin, copy and run this command. ``` ./steampipe plugin install azure ``` Your output should look like this: ``` Installed plugin: azure@latest v0.31.0 Documentation: https://hub.steampipe.io/plugins/turbot/azure ``` ## Run your first query To launch Steampipe in query mode, do this: ```bash ./steampipe query ``` Steampipe prints a welcome message and a prompt. ``` Welcome to Steampipe v0.16.3 For more information, type .help > ``` Let's query the [azure_subscription](https://hub.steampipe.io/plugins/turbot/azure/tables/azure_subscription) table. ``` > select subscription_id, display_name, state, authorization_source, subscription_policies from azure_subscription; +--------------------------------------+--------------+---------+----------------------+-----------------------+ | subscription_id | display_name | state | authorization_source | subscription_policies | +--------------------------------------+--------------+---------+----------------------+-----------------------+ | 3510aexd-53Qb-496d-8f30-53x9616fc6c1 | Stacy AAA | Enabled | RoleBased | {} | +--------------------------------------+--------------+---------+----------------------+-----------------------+ ``` That's it! You didn't have to read Azure API docs, or install an [API client library](https://learn.microsoft.com/en-us/azure/data-explorer/kusto/api/client-libraries), or learn how to use that client to make API calls and unpack JSON responses. Steampipe did all that for you. It works the same way for every Azure table. And because you can use SQL to join across Azure tables, it's easy to reason over your entire Azure infrastructure. To see the full set of columns for any table, along with examples of their use, visit the [Steampipe Hub](https://hub.steampipe.io/plugins/turbot/azure/tables). For quick reference you can autocomplete table names directly in the shell.
If you haven't used SQL lately, see our [handy guide](https://steampipe.io/docs/sql/steampipe-sql) for writing Steampipe queries. --- --- title: Using Steampipe in CircleCI sidebar_label: CircleCI --- # Using Steampipe in CircleCI CircleCI provides a [hosted environment](https://circleci.com/) in which you can build, test, and deploy software. It integrates with services such as GitHub, GitLab and Bitbucket to listen to events that trigger pipelines or consume source code. Here we integrate a GitLab project with CircleCI to install Steampipe, then install a plugin and run a query. ## Installing Steampipe in CircleCI To run scripts, first connect your GitLab repository to CircleCI and create a `config.yml` file that contains the definitions of the Pipeline. Here's an example that installs Steampipe. ```yaml version: 2.1 jobs: install: machine: true steps: - checkout - run: echo "Hello, let's install Steampipe!" - run: sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" workflows: my-workflow: jobs: - install ``` ## Running Steampipe in CircleCI In order to run Steampipe commands, we will first install the [Hacker News](https://hub.steampipe.io/plugins/turbot/hackernews) plugin. ```yaml version: 2.1 jobs: install: machine: true steps: - checkout - run: echo "Hello, let's install Steampipe!" - run: sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" - run: 'steampipe plugin install hackernews' workflows: my-workflow: jobs: - install ```
Next, we'll update the file with a query to fetch the top 10 stories from `hackernews_best`. ```yaml version: 2.1 jobs: install: machine: true steps: - checkout - run: echo "Hello, let's install Steampipe!" - run: sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" - run: 'steampipe plugin install hackernews' - run: 'steampipe query "select id, title, score from hackernews_best order by score desc limit 10"' workflows: my-workflow: jobs: - install ```
That's it! Now you can use any of Steampipe's [plugins](https://hub.steampipe.io/plugins) to enrich your CircleCI pipelines. --- --- title: Using Steampipe in AWS Cloud9 sidebar_label: AWS Cloud9 --- # Using Steampipe in AWS Cloud9 [AWS Cloud9](https://aws.amazon.com/cloud9/) is a cloud-based IDE integrated with a code editor, debugger, and terminal that enables you to write, run, and debug your code with a browser. Steampipe seamlessly integrates to enable querying of AWS resources and creation of Steampipe dashboards. ## Installing Steampipe in AWS Cloud9 To install Steampipe, paste this command in your AWS Cloud9 terminal. ``` sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/scripts/install.sh)" ```
## Query AWS resources To query AWS resources using Steampipe, first install the [AWS plugin](https://hub.steampipe.io/plugins/turbot/aws) with this command. ``` steampipe plugin install aws ``` Because Cloud9 includes the AWS CLI and knows your credentials, you can immediately run SQL queries to retrieve data from hundreds of Postgres tables supported by the plugin. This query retrieves public access details for S3 buckets in your account. ```sql select region, block_public_acls, bucket_policy_is_public, ignore_public_acls, restrict_public_buckets, block_public_policy, name from aws_s3_bucket; ```
--- --- title: Using Steampipe in Google Cloud Shell sidebar_label: Google Cloud Shell --- # Using Steampipe in Google Cloud Shell Google Cloud Shell is a web-based environment preloaded with tools for managing Google Cloud. Because Google's cloud shell includes the CLI and launches with your credentials, you can quickly install Steampipe along with the GCP plugin and then instantly query your cloud resources. ## About the Google Cloud Shell The [Google Cloud Shell](https://cloud.google.com/shell) is free to all Google Cloud customers. Because it's a free resource, Google imposes a few [limits on the service](https://cloud.google.com/shell/docs/quotas-limits). You can only use 50 hours of Google Cloud Shell each week. Additionally, the home directory of your cloud shell is deleted if you don't use your Cloud Shell for 120 days. An inactive Cloud Shell is shut down after one hour, and an active session can run for at most 12 hours. When the Google Cloud Shell terminates, only files inside the home directory are preserved. For that reason, we'll install the Steampipe binary in your home directory rather than in `/usr/local/bin`, and all Steampipe commands will start with `./`. To get started with the Google Cloud Shell, go to the [Google Cloud Console](https://console.cloud.google.com/). Select a Google Project that has billing enabled, then click on the Cloud Shell icon in the upper right.
*Screenshot: the Google Cloud console, showing project selection and the location of the Cloud Shell icon.*
## Installing Steampipe in Google Cloud Shell To install Steampipe, copy and run this command. ```bash curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzf - ``` To install the GCP plugin, copy and run this command. ```bash ./steampipe plugin install gcp ``` Your output should look something like: ```bash Installed plugin: gcp@latest v0.27.0 Documentation: https://hub.steampipe.io/plugins/turbot/gcp ``` ## Run your first query To run a query, type: ```bash ./steampipe query ``` Let's query the [gcp_project](https://hub.steampipe.io/plugins/turbot/gcp/tables/gcp_project) table. ```sql select name, project_id, project_number, lifecycle_state, create_time from gcp_project; ``` You may find that the first time you run a query, a dialog box will prompt you to authorize Cloud Shell to use your credentials. Click "Authorize".
*Screenshot: Google prompting the user to authorize Cloud Shell.*
That's it! You didn't have to read [GCP API docs](https://cloud.google.com/apis/docs/overview), install an [API client library](https://cloud.google.com/python/docs/reference), or learn how to use that client to make API calls and unpack JSON responses. Steampipe did all that for you. It works the same way for every GCP table. And because you can use SQL to join across multiple tables representing GCP services, it's easy to reason over your entire GCP organization. To view information about your [GCP Organization](https://hub.steampipe.io/plugins/turbot/gcp/tables/gcp_organization), you can run: ```sql select display_name, organization_id, lifecycle_state, creation_time from gcp_organization; ``` To see the full set of columns for any table, along with examples of their use, visit the [Steampipe Hub](https://hub.steampipe.io/plugins/turbot/gcp/tables). For quick reference you can autocomplete table names directly in the shell. If you haven't used SQL lately, see our [handy guide](https://steampipe.io/docs/sql/steampipe-sql) for writing Steampipe queries. --- --- title: Authenticate to AWS with OIDC sidebar_label: AWS + OIDC --- # Authenticate to AWS with OIDC If you run Steampipe in a [GitHub Action](https://steampipe.io/docs/integrations/github_actions/installing_steampipe), you can use GitHub Actions Secrets to store the credentials that Steampipe uses to access AWS, Azure, GCP, or another cloud API. But what if you don't want to persist credentials there? An alternative is to use OpenID Connect (OIDC) to enable an Actions workflow that acquires temporary credentials on demand. The example shown in this guide uses the OIDC method in a workflow that: 1. Installs Steampipe (along with a cloud-specific plugin). 2. Runs a Steampipe query and saves the output in the repository. ## What is OIDC? [OpenID Connect 1.0](https://openid.net/specs/openid-connect-core-1_0.html) is an identity layer on top of the [OAuth 2.0 protocol](https://www.rfc-editor.org/rfc/rfc6749). It enables clients to verify the identity of the End-User based on the authentication performed by an Authorization Server, as well as to obtain basic profile information about the End-User in an interoperable and REST-like manner. ## Define the workflow First, we must create the GitHub Actions workflow file in your repository. For this [example](https://github.com/turbot/steampipe-samples/blob/main/all/github-actions-oidc/aws/steampipe-sample-aws-workflow.yml) we will use the filename `.github/workflows/steampipe.yml`. ### Triggers GitHub supports a variety of event-driven triggers. Here we define two: `workflow_dispatch` to [manually run a workflow](https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow) and `schedule` to run on a cron-like schedule. To trigger the `workflow_dispatch` event, your workflow must be in the default branch. ```yaml on: workflow_dispatch: schedule: - cron: "0 4 7,14,21,28 * *" ``` ### Permissions Every time your job runs, GitHub's OIDC Provider auto-generates an OIDC token. This token contains multiple claims to establish a security-hardened and verifiable identity about the workflow that is trying to authenticate. In order to request this OIDC JWT ID token, your job or workflow run requires a permissions setting with `id-token: write`. In order to check out the GitHub repository and save the query results to it, your job or workflow run also requires a permissions setting with `contents: write`. 
```yaml permissions: id-token: write contents: write ``` ### Steps A workflow comprises one or more jobs that run in parallel, each with one or more steps that run in order. Our example defines a single job with a series of steps that authenticate to AWS, install Steampipe, run a query and save the results to the repository. First, create a step that configures the credentials Steampipe will use to access AWS. ```yaml - name: "Configure AWS credentials" id: config-aws-auth uses: aws-actions/configure-aws-credentials@v1-node16 with: role-to-assume: ${{ secrets.OIDC_AWS_ROLE_TO_ASSUME }} role-session-name: "steampipe-demo" role-duration-seconds: 900 aws-region: "us-east-1" ``` Once the cloud provider successfully validates the claims presented in the OIDC JWT ID token, it then provides a short-lived access token that is available only for the duration of the job. The short-lived access token is exported as environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN`. Steampipe will load these short-lived [credentials from environment variables](https://hub.steampipe.io/plugins/turbot/aws#credentials-from-environment-variables) to run the query. Next, you'll need to create a step that installs the Steampipe CLI and AWS plugin. ```yaml - name: "Install Steampipe CLI and AWS plugin" id: steampipe-installation uses: turbot/steampipe-action-setup@v1 with: steampipe-version: 'latest' plugin-connections: | connection "aws" { plugin = "aws" } ``` Before running the query, create a new folder on the branch specified in your GitHub repository to save the query output. In our example, we will save the output to the folder `steampipe/output/aws`. The default environment variable [GITHUB_WORKSPACE](https://docs.github.com/en/actions/learn-github-actions/variables#default-environment-variables) refers to the default working directory on the runner for steps, and the default location of your repository when using the [checkout](https://github.com/actions/checkout) action. Next, create a step that runs the query and outputs the results to CSV. ```yaml - name: "Run Steampipe query" id: steampipe-query continue-on-error: true run: | steampipe query "select instance_id, instance_state, launch_time, state_transition_time from aws_ec2_instance" > output/aws/instances.csv ``` Finally, add a step that pushes the output of the query to your repository. Update the `working-directory` to the folder created in the above step. ```yaml - name: "Commit the file to GitHub" id: push-to-gh working-directory: steampipe/output/aws run: | git config user.name github-actions git config user.email github-actions@github.com git add instances.csv git commit -m "Add Steampipe Benchmark Results" git push ``` ## Configuring GitHub's OIDC provider for AWS In order for AWS to trust GitHub, you must configure OIDC as an identity provider in your AWS account. GitHub's [Security hardening your deployments](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments) page has instructions for using OpenID Connect with various providers. If you prefer AWS CloudFormation, you can make use of this [link](https://github.com/aws-actions/configure-aws-credentials#sample-iam-role-cloudformation-template) to get started. To help you follow those instructions we have created a [Terraform sample](https://github.com/turbot/steampipe-samples/tree/main/all/github-actions-oidc/aws). This guide will demonstrate the Terraform implementation. 
This Terraform script will create two AWS resources in your account: an identity provider and an IAM role. These resources together form an OIDC trust between the AWS IAM role and your GitHub workflow(s) that need access to the cloud. In order to execute the Terraform code and deploy the resources GitHub needs, you will need local credentials for the target AWS account. This can be via AWS Identity Center, IAM User Access Keys, or via environment variables. Regardless of how you [authenticate](https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration), the identity you use to deploy the Terraform code must have permission to create IAM resources. ### Configuration Update the following variables in the `default.tfvars` file. * `github_repo`: GitHub repository that needs the access token. Example: octo-org/octo-repo * `github_branch`: GitHub branch that runs the workflow. If you plan to trigger the workflow on a schedule, then this must be the default branch. If you plan to run the workflow manually, this can be any branch. Example: master * `aws_iam_role_name`: Name of the AWS IAM Role to create. Example: steampipe_gh_oidc_demo ### Implementation Navigate to the folder where the [Terraform sample for AWS](https://github.com/turbot/steampipe-samples/tree/main/all/github-actions-oidc/aws) is cloned. Run the commands below to create the necessary resources in your AWS account. ```bash # Initialize Terraform to get all necessary providers. terraform init # Apply the configuration using the configuration file "default.tfvars" terraform apply -var-file=default.tfvars ``` Successful execution of the above will give a Terraform output value of `OIDC_AWS_ROLE_TO_ASSUME`. This is the ARN of the IAM role that handles the OIDC federation. Add `OIDC_AWS_ROLE_TO_ASSUME` and its value as a secret in your GitHub repository. 
### Validation Log in to your AWS account to verify that Terraform has created the following resources. * AWS > IAM > Identity provider > token.actions.githubusercontent.com * AWS > IAM > Role (role name: steampipe_gh_oidc_demo) The AWS IAM console should show the identity provider `token.actions.githubusercontent.com`.
The IAM role (`steampipe_gh_oidc_demo`) should show the corresponding trust relationship for your GitHub repository and branch.
## Running the workflow on-demand The job will run on schedule, but it's always helpful to [run it manually](https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow) as a sanity check. Make sure you select the correct branch when running it manually; the branch must be listed in the trust relationship of your IAM role (the `github_branch` variable in the Terraform script).
After a successful run of the GitHub Action (scheduled or manual), the Steampipe query result is automatically pushed to your GitHub repository. --- --- title: Installing Steampipe in GitHub Actions sidebar_label: Installing Steampipe --- # Installing Steampipe in GitHub Actions GitHub provides a [hosted environment](https://docs.github.com/en/actions/) in which you can build, test, and deploy software. ## Installing Steampipe To run scripts when you push changes to a GitHub repository, create a file `.github/workflows/steampipe.yml`. This will install the latest version of Steampipe. ``` name: Run Steampipe on: push: jobs: steampipe: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - uses: turbot/steampipe-action-setup@v1 ``` ## Installing and configuring plugin(s) The [turbot/steampipe-action-setup](https://github.com/turbot/steampipe-action-setup) action can also install and configure plugins. ``` - uses: turbot/steampipe-action-setup@v1 with: steampipe-version: 'latest' plugin-connections: | connection "hackernews" { plugin = "hackernews" } ``` Next, add a step to run a query: ``` - uses: turbot/steampipe-action-setup@v1 with: steampipe-version: 'latest' plugin-connections: | connection "hackernews" { plugin = "hackernews" } - name: Query HN run: steampipe query "select id, title from hackernews_item where type = 'story' and title is not null order by id desc limit 5" ``` For more examples, please see [turbot/steampipe-action-setup examples](https://github.com/turbot/steampipe-action-setup#examples). --- --- title: GitHub Actions sidebar_label: GitHub Actions --- # Overview Steampipe brings powerful capabilities to your GitHub Actions pipelines. - **[Installing Steampipe on GitHub Actions →](/docs/integrations/github_actions/installing_steampipe)** - **[Authenticate to AWS with OIDC →](/docs/integrations/github_actions/aws_oidc)** --- --- title: Using Steampipe in a GitLab CI/CD Pipeline sidebar_label: GitLab CI/CD --- # Using Steampipe in a GitLab CI/CD Pipeline GitLab provides a [hosted environment](https://docs.gitlab.com/ee/ci/) in which you can build, test, and deploy software. This happens in a [GitLab Runner](https://docs.gitlab.com/runner/). Let's install Steampipe into a shared runner on gitlab.com, then install a plugin and run a query. ## Installing Steampipe in a GitLab Runner To run scripts when you push changes to a gitlab.com repo, you place them in a file called `.gitlab-ci.yml`. Here's an example that installs Steampipe into the runner's environment. ``` install: stage: build script: - echo "Hello, $GITLAB_USER_LOGIN, let's install Steampipe!" - /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" ``` The official command to install Steampipe begins with `sudo`. That isn't necessary here, though, because in this environment you already are the root user. ## Running Steampipe in a GitLab Runner Steampipe cannot, however, run as root. So we'll create a non-privileged user, and switch to that user in order to run Steampipe commands. Our first command will install the [Hacker News](https://hub.steampipe.io/plugins/turbot/hackernews) plugin. ``` install: stage: build script: - echo "Hello, $GITLAB_USER_LOGIN, let's install Steampipe!" - adduser --disabled-password --shell /bin/bash jon - /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" - su jon -c "steampipe plugin install hackernews" ```
Next, we'll add a file called `hn.sql` to the repo.

```sql
select id, title from hackernews_item where type = 'story' and title is not null order by id desc limit 5
```

Finally, we'll copy `hn.sql` into the home directory of the non-privileged user, then run a query.

```
install:
  stage: build
  script:
    - echo "Hello, $GITLAB_USER_LOGIN, let's install Steampipe!"
    - adduser --disabled-password --shell /bin/bash jon
    - cp hn.sql /home/jon
    - cd /home/jon
    - ls -l
    - /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)"
    - su jon -c "steampipe plugin install hackernews"
    - su jon -c "steampipe query hn.sql"
```
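If you want to keep the results rather than just print them to the job log, Steampipe's `--output` flag can emit machine-readable formats. As a rough sketch (the file name is illustrative), you could append a line like this to the `script` section and then declare the file as a pipeline artifact; note that artifact paths must live under the project directory.

```bash
# Write the query results as CSV into the project directory so it can be
# declared as a GitLab artifact; runs the query as the non-privileged user.
su jon -c "steampipe query hn.sql --output csv" > "$CI_PROJECT_DIR/hn.csv"
```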
That's it! Now you can use any of Steampipe's [plugins](https://hub.steampipe.io/plugins) to enrich your GitLab pipelines. --- --- title: Using Steampipe in Gitpod sidebar_label: Gitpod --- # Using Steampipe in Gitpod [Gitpod](https://www.gitpod.io/) is an open source platform provisioning ready-to-code developer environments that integrates with GitHub. Here we integrate a Github project with Gitpod to install Steampipe, then install a plugin and run a query. ## Installing Steampipe in Gitpod To run scripts, first connect your GitHub repository to your Gitpod workspace and create a `.gitpod.yml` file that contains the definitions. Here's an example that installs Steampipe. ```yaml tasks: - name: Install Steampipe with RSS Plugin init: | sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" steampipe -v ``` ## Running Steampipe in Gitpod In order to run Steampipe commands, we will first install the [RSS](https://hub.steampipe.io/plugins/turbot/rss) plugin. ```yaml tasks: - name: Install Steampipe with RSS Plugin init: | sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" steampipe -v steampipe plugin install steampipe steampipe plugin install rss ports: # Steampipe/ PostgreSQL - port: 9193 ```
Next, we'll update the file with a query to list items from an RSS feed. ```yaml tasks: - name: Install Steampipe with RSS Plugin init: | sudo /bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/turbot/steampipe/main/install.sh)" steampipe -v steampipe plugin install steampipe steampipe plugin install rss steampipe query "select title, published, link from rss_item where feed_link = 'https://www.hardcorehumanism.com/feed/' order by published desc;" command: | steampipe service status ports: # Steampipe/ PostgreSQL - port: 9193 ```
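Once the workspace is up, you can also run ad-hoc queries from the Gitpod terminal, since the init task has already installed the plugin. For example (the feed URL is just an illustration):

```bash
# Run an ad-hoc query from the Gitpod terminal.
steampipe query "select title, link from rss_item where feed_link = 'https://www.hardcorehumanism.com/feed/' limit 5"
```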
That's it! Now you can use any of Steampipe's [plugins](https://hub.steampipe.io/plugins) in your Gitpod workspace. --- --- title: Using Steampipe in Jenkins sidebar_label: Jenkins --- # Using Steampipe in Jenkins Jenkins provides a [hosted environment](https://www.jenkins.io/) in which you can build, test, and deploy software. This happens in a [Jenkins Pipeline](https://www.jenkins.io/doc/book/pipeline/). Let's use a pipeline to install Steampipe, then install a plugin and run a query. ## Installing Steampipe in a Jenkins pipeline To run scripts, you first create a `Jenkinsfile` which is a text file that contains the definition of a Jenkins Pipeline. Here's an example that installs Steampipe. ``` pipeline { agent any stages { stage("Install") { steps { sh "curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzf -" echo "installed steampipe" } } } } ``` ## Running Steampipe in a Jenkins pipeline In order to run Steampipe commands, we will first install the [Hacker News](https://hub.steampipe.io/plugins/turbot/hackernews) plugin. ``` pipeline { agent any stages { stage("Install") { steps { sh "curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzf -" echo "installed steampipe" sh './steampipe plugin install hackernews' } } } } ```
Next, we'll update the file to include a query to fetch the top 5 stories from `hackernews_top`. ``` pipeline { agent any stages { stage("Install") { steps { sh "curl -s -L https://github.com/turbot/steampipe/releases/latest/download/steampipe_linux_amd64.tar.gz | tar -xzf -" echo "installed steampipe" sh './steampipe plugin install hackernews' sh './steampipe query "select id, title, score from hackernews_top order by score desc limit 5"' } } } } ```
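If a later stage needs the results, you could add one more `sh` step that writes them to a file instead of stdout, and then keep the file with Jenkins' `archiveArtifacts` step. A minimal sketch (the file name is illustrative):

```bash
# Save the results as JSON so another stage can archive or post-process them.
./steampipe query "select id, title, score from hackernews_top order by score desc limit 5" --output json > hn_top.json
```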
That's it! Now you can use any of Steampipe's [plugins](https://hub.steampipe.io/plugins) to enrich your Jenkins pipelines. --- --- title: Integrations sidebar_label: Integrations --- # Integrations Steampipe is easy to install and run on your laptop or server, but is also simple to setup and run it from a variety of [cloud shells](#cloud-shells) and [CI/CD tools](#cicd-pipelines)! ## Cloud Shells Cloud Shells are browser-based terminals in which you can install Steampipe and run queries against the resources provided by AWS, Azure, GCP, or other clouds. Because they typically launch with the cloud provider's CLI already installed -- and configured with your credentials -- Cloud Shells can be the fastest and easiest way to query your cloud resources with Steampipe. - [AWS Cloud Shell](/docs/integrations/aws_cloudshell) - [Azure Cloud Shell](/docs/integrations/azure_cloudshell) - [AWS Cloud9](/docs/integrations/cloud9) - [Google Cloud Shell](/docs/integrations/gcp_cloudshell) - [Gitpod](/docs/integrations/gitpod) ## CI/CD Pipelines CI/CD pipelines enable you to install and run your own tools, including Steampipe. The examples here show you how to install Steampipe in CI/CD pipelines, then install Steampipe plugins, then run queries. - [CircleCI](/docs/integrations/circleci) - [GitHub Actions](/docs/integrations/github_actions/overview) - [GitLab CI/CD](/docs/integrations/gitlab_ci_cd) - [Jenkins](/docs/integrations/jenkins) --- --- id: learn title: Learn Steampipe sidebar_label: Learn Steampipe slug: / --- # Learn Steampipe Steampipe provides zero-ETL tools for fetching data directly from APIs and services. Steampipe is offered in several distributions: - The **Steampipe CLI** exposes APIs and services as a high-performance relational database, enabling you to write SQL-based queries to explore dynamic data. The Steampipe CLI is a turnkey solution that includes its own PostgreSQL database including plugin management. - **[Steampipe Postgres FDWs](/docs/steampipe_postgres/overview)** are native Postgres Foreign Data Wrappers that translate APIs to foreign tables. Unlike Steampipe CLI, which ships with its own Postgres server instance, the Steampipe Postgres FDWs can be installed in any supported Postgres database version. - **[Steampipe SQLite Extensions](/docs/steampipe_sqlite/overview)** provide SQLite virtual tables that translate your queries into API calls, transparently fetching information from your API or service as you request it. - **[Steampipe Export CLIs](/docs/steampipe_export/overview)** provide a flexible mechanism for exporting information from cloud services and APIs. Each exporter is a stand-alone binary that allows you to extract data using Steampipe plugins *without a database*. - **[Turbot Pipes](/docs/steampipe-cloud)** is the only intelligence, automation & security platform built specifically for DevOps. Pipes provides hosted Steampipe database instances, shared dashboards, snapshots, and more! This tutorial uses the Steampipe CLI. Let's dive in... ## Install the AWS plugin This tutorial uses the [AWS plugin](https://hub.steampipe.io/plugins/turbot/aws). To get started, [download and install Steampipe](/downloads), and then install the plugin: ```bash steampipe plugin install aws ``` Steampipe will download and install additional components the first time you run `steampipe query` so it may take a few seconds to load initially. 
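If you'd like to sanity-check the install before moving on, you can list the installed plugins and the connections they created (the `plugin list` command is covered in more detail under Managing Plugins):

```bash
# Confirm the aws plugin is installed and see which connection it created.
steampipe plugin list
```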
Out of the box, Steampipe will use your default AWS credentials from your credential file and/or environment variables, so you'll need to make sure those are set up as well. If you can run `aws ec2 describe-vpcs`, you're good to go. (The AWS plugin provides additional examples to [configure your credentials](https://hub.steampipe.io/plugins/turbot/aws#configuring-aws-credentials), and even configure steampipe to query [multiple accounts](https://hub.steampipe.io/plugins/turbot/aws#multi-account-connections) and [multiple regions](https://hub.steampipe.io/plugins/turbot/aws#multi-region-connections ).) ## Explore Steampipe provides commands that allow you to discover and explore the tables and data without leaving the query shell. (Of course, this information is all available in [the hub](https://hub.steampipe.io/plugins/turbot/aws/tables) if online docs are more your speed...) Let's fire up Steampipe! Run `steampipe query` to open an interactive query session: ```bash $ steampipe query Welcome to Steampipe v0.5.0 For more information, type .help > ``` Now run the `.tables` meta-command to list the available tables: ``` > .tables ==> aws +----------------------------------------+---------------------------------------------+ | table | description | +----------------------------------------+---------------------------------------------+ | aws_accessanalyzer_analyzer | AWS Access Analyzer | | aws_account | AWS Account | | aws_acm_certificate | AWS ACM Certificate | | aws_api_gateway_api_key | AWS API Gateway API Key | ... +----------------------------------------+---------------------------------------------+ ``` As you can see, there are quite a few tables available in the AWS plugin! It looks like there's an `aws_iam_role` table - let's run `.inspect` to see what's in that table: ``` > .inspect aws_iam_role +---------------------------+-----------------------------+---------------------------------------------------------------------------------------------------+ | column | type | description | +---------------------------+-----------------------------+---------------------------------------------------------------------------------------------------+ | account_id | text | The AWS Account ID in which the resource is located. | | akas | jsonb | Array of globally unique identifier strings (also known as) for the resource. | | arn | text | The Amazon Resource Name (ARN) specifying the role. | | assume_role_policy | jsonb | The policy that grants an entity permission to assume the role. | | assume_role_policy_std | jsonb | Contains the assume role policy in a canonical form for easier searching. | | attached_policy_arns | jsonb | A list of managed policies attached to the role. | | create_date | timestamp without time zone | The date and time when the role was created. | | description | text | A user-provided description of the role. | | inline_policies | jsonb | A list of policy documents that are embedded as inline policies for the role.. | | inline_policies_std | jsonb | Inline policies in canonical form for the role. | | instance_profile_arns | jsonb | A list of instance profiles associated with the role. | | max_session_duration | bigint | The maximum session duration (in seconds) for the specified role. Anyone who uses the AWS CLI, or | | | | API to assume the role can specify the duration using the optional DurationSeconds API parameter | | | | or duration-seconds CLI parameter. | | name | text | The friendly name that identifies the role. 
| | partition | text | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). | | path | text | The path to the role. | | permissions_boundary_arn | text | The ARN of the policy used to set the permissions boundary for the role. | | permissions_boundary_type | text | The permissions boundary usage type that indicates what type of IAM resource is used as the permi | | | | ssions boundary for an entity. This data type can only have a value of Policy. | | region | text | The AWS Region in which the resource is located. | | role_id | text | The stable and unique string identifying the role. | | role_last_used_date | timestamp without time zone | Contains information about the last time that an IAM role was used. Activity is only reported for | | | | the trailing 400 days. This period can be shorter if your Region began supporting these features | | | | within the last year. The role might have been used more than 400 days ago. | | role_last_used_region | text | Contains the region in which the IAM role was used. | | tags | jsonb | A map of tags for the resource. | | tags_src | jsonb | A list of tags that are attached to the role. | | title | text | Title of the resource. | +---------------------------+-----------------------------+---------------------------------------------------------------------------------------------------+ ``` ## Query Now that we know what columns are available in the `aws_iam_role` table, let's run a simple query to list the roles: ```sql select name from aws_iam_role ``` ``` +------------------------------------------------------------------+ | name | +------------------------------------------------------------------+ | AWSServiceRoleForOrganizations | | aws-elasticbeanstalk-service-role | | admin | | AWSServiceRoleForAmazonElasticsearchService | | user | | AWSServiceRoleForAccessAnalyzer | | CLoudtrailRoleForCloudwatchLogs | | aws-elasticbeanstalk-ec2-role | | rds_metadata | | metadata | | AWSServiceRoleForAutoScaling | | operator | | s3crr_role_for_vanedaly-replicated-bucket-01_to_test-repl-dest-f | | iam_owner | | ec2_owner | | ec2_operator | | AWSServiceRoleForSSO | +------------------------------------------------------------------+ ``` Now let's ask a more interesting question. Let's find roles that have no boundary policy applied: ```sql select name from aws_iam_role where permissions_boundary_arn is null; ``` ``` +------------------------------------------------------------------+ | name | +------------------------------------------------------------------+ | AWSServiceRoleForOrganizations | | aws-elasticbeanstalk-service-role | | AWSServiceRoleForAmazonElasticsearchService | | AWSServiceRoleForAccessAnalyzer | | CLoudtrailRoleForCloudwatchLogs | | aws-elasticbeanstalk-ec2-role | | AWSServiceRoleForAutoScaling | | s3crr_role_for_vanedaly-replicated-bucket-01_to_test-repl-dest-f | | AWSServiceRoleForSSO | +------------------------------------------------------------------+ ``` Like any database, we can join tables together as well. 
For instance, we can find all the roles that have AWS-managed policies attached: ```sql select r.name, policy_arn, p.is_aws_managed from aws_iam_role as r, jsonb_array_elements_text(attached_policy_arns) as policy_arn, aws_iam_policy as p where p.arn = policy_arn and p.is_aws_managed; ``` ``` +-------------------------------------------------------+------------------------------------------------------------------------------------+----------------+ | name | policy_arn | is_aws_managed | +-------------------------------------------------------+------------------------------------------------------------------------------------+----------------+ | aws-elasticbeanstalk-ec2-role | arn:aws:iam::aws:policy/AWSElasticBeanstalkWorkerTier | true | | aws-elasticbeanstalk-ec2-role | arn:aws:iam::aws:policy/AWSElasticBeanstalkMulticontainerDocker | true | | admin | arn:aws:iam::aws:policy/ReadOnlyAccess | true | | AWSServiceRoleForSSO | arn:aws:iam::aws:policy/aws-service-role/AWSSSOServiceRolePolicy | true | | AWSServiceRoleForAccessAnalyzer | arn:aws:iam::aws:policy/aws-service-role/AccessAnalyzerServiceRolePolicy | true | | aws-elasticbeanstalk-service-role | arn:aws:iam::aws:policy/service-role/AWSElasticBeanstalkEnhancedHealth | true | | AWSServiceRoleForElasticLoadBalancing | arn:aws:iam::aws:policy/aws-service-role/AWSElasticLoadBalancingServiceRolePolicy | true | | aws-elasticbeanstalk-service-role | arn:aws:iam::aws:policy/service-role/AWSElasticBeanstalkService | true | | AWSServiceRoleForOrganizations | arn:aws:iam::aws:policy/aws-service-role/AWSOrganizationsServiceTrustPolicy | true | +-------------------------------------------------------+------------------------------------------------------------------------------------+----------------+ ``` ## What's Next? We've merely scratched the surface of what you can do with Steampipe! - [Discover more plugins on the Steampipe Hub →](https://hub.steampipe.io/plugins/) - [Run dashboards and benchmarks with Powerpipe →](https://powerpipe.io) - [Build workflows as code with Flowpipe →](https://flowpipe.io) - [Join #steampipe on Slack →](https://turbot.com/community/join) - Want to share Steampipe with your team? [Try Turbot Pipes →](https://turbot.com/pipes) --- --- title: Managing Connections sidebar_label: Connections --- # Managing Connections A Steampipe **connection** represents a set of tables for a single data source. Each connection is represented as a distinct Postgres schema. A connection is associated with a single plugin type. The boundary/scope of the connection varies by plugin, but is typically aligned with the vendor's cli tool and/or api. For example: - An `azure` connection contains tables for a single Azure subscription - A `google` connection contains tables for a single GCP project - An `aws` connection contains tables for a single AWS account Many plugins will create a default connection when they are installed. This connection should be dynamic, and use the same scope and credentials that would be used for the equivalent CLI. Usually, this entails evaluating environment variables (`AWS_PROFILE`, `AWS_REGION`, `AZURE_SUBSCRIPTION_ID`, etc) and configuration files -- The details vary by provider. 
This means that by default, Steampipe "just works" per the CLI:

- `select * from aws_ec2_instance` in the `aws` connection will target the same account/region as `aws ec2 describe-instances`
- `select * from azure_compute_virtual_machine` in the `azure` connection works the same as `az vm list`

Note that there is nothing special about the default connection, other than that it is created by default on plugin install; you can delete or rename this connection, or modify its configuration options (via the configuration file).

## Connection configuration files

### Structure

Connection configurations are defined using HCL in one or more Steampipe config files. Steampipe will load ALL configuration files from `~/.steampipe/config` that have a `.spc` extension. A config file may contain multiple connections.

Upon installation, a plugin may install a default configuration file, typically named `{plugin name}.spc`. This file usually contains a single connection, configured in such a way as to dynamically match the configuration of the associated CLI. In addition, it may contain commented-out sample connections for common configurations.

For example, the `aws` plugin will install the `~/.steampipe/config/aws.spc` configuration file. This file contains a single `aws` connection definition that configures the plugin to use the same configuration as the `aws` CLI.

### Syntax

Steampipe config files use HCL syntax, with connections defined in a `connection` block. The `connection` name will be used as the Postgres schema name in the Steampipe database.

Each `connection` must contain a single `plugin` argument that specifies which plugin to use in this connection. Additional arguments are plugin-specific, and are used to determine the scope, credentials, and other configuration items.

Note: Connection names typically use lowercase characters and underscores. It's possible to use other characters, but be aware that the schema names derived from such connection names will need to be quoted in SQL. A statement like `select * from aws_profile_1.aws_account` requires no quotation, but a statement like `select * from "Aws:01(profile2)".aws_account` does.

The `plugin` argument should contain the path to the plugin relative to the plugin directory. Note that for standard Steampipe plugins installed from the Steampipe Hub, the short name may be used, and `latest` is assumed if the tag is omitted, thus the following are equivalent:

```hcl
connection "aws" {
  plugin = "aws"
}
```

```hcl
connection "aws" {
  plugin = "hub.steampipe.io/plugins/turbot/aws@latest"
}
```

A plugin may define additional, plugin-specific arguments.
For example, the AWS plugin allows you to define one or more regions to query, and either an AWS profile or key pair to use for authentication: ```hcl // default connection "aws" { plugin = "aws" } // credentials via profile connection "aws_profile2" { plugin = "aws" profile = "profile2" regions = ["us-east-1", "us-west-2"] } // credentials via key pair connection "aws_another_account" { plugin = "aws" secret_key = "gMCYsoGqjfThisISNotARealKeyVVhh" access_key = "ASIA3ODZSWFYSN2PFHPJ" regions = ["us-east-1"] } ``` Plugin-specific configuration details can be found in the plugin documentation on the [Steampipe Hub](https://hub.steampipe.io) ## Querying multiple connections A plugin may contain multiple connections: ```hcl // default connection "aws" { plugin = "aws" } connection "aws_01" { plugin = "aws" profile = "aws_01" regions = ["us-east-1", "us-west-2"] } connection "aws_02" { plugin = "aws" profile = "aws_02" regions = ["us-east-1", "us-west-2"] } connection "aws_03" { plugin = "aws" profile = "aws_03" regions = ["us-east-1", "us-west-2"] } ``` Each connection is implemented as a distinct [Postgres schema](https://www.postgresql.org/docs/current/ddl-schemas.html). As such, you can use qualified table names to query a specific connection: ```sql select * from aws_02.aws_account ``` Alternatively, can use an unqualified name and it will be resolved according to the [Search Path](#setting-the-search-path): ```sql select * from aws_account ``` ## Using Aggregators You can aggregate or search for data across multiple connections by using an **aggregator** connection. Aggregators allow you to query data from multiple connections for a plugin as if they are a single connection. For example, using aggregators, you can create tables that allow you to query multiple AWS accounts: ```hcl connection "aws_all" { plugin = "aws" type = "aggregator" connections = ["aws_01", "aws_02", "aws_03"] } ``` Querying tables from this connection will return results from the `aws_01`, `aws_02`, and `aws_03` connections: ```sql select * from aws_all.aws_account ``` Steampipe supports the `*` wildcard in the connection names. For example, to aggregate all the AWS plugin connections whose names begin with `aws_`: ```hcl connection "aws_all" { type = "aggregator" plugin = "aws" connections = ["aws_*"] } ``` Aggregators are powerful, but they are not infinitely scalable. Like any other steampipe connection, they query APIs and are subject to API limits and throttling. Consider as an example and aggregator that includes 3 AWS connections, where each connection queries 16 regions. This means you essentially run the same list API calls 48 times! When using aggregators, it is especially important to: - Query only what you need! `select * from aws_s3_bucket` must make a list API call in each connection, and then 11 API calls *for each bucket*, where `select name, versioning_enabled from aws_s3_bucket` would only require a single API call per bucket. - Consider extending the [cache TTL](/docs/reference/config-files/connection). The default is currently 300 seconds (5 minutes). Obviously, anytime steampipe can pull from the cache, it is faster and less impactful to the APIs. If you don't need the most up-to-date results, increase the cache TTL! ### Aggregating Dynamic Tables Most tables in Steampipe plugins are statically defined -- the column names and types are defined at compile time. 
As a result, all connections for a given table from a given plugin have the same structure and they can be aggregated by simply appending data. Some plugins define tables dynamically, and their structure is only known at runtime. The [`kubernetes` plugin](https://hub.steampipe.io/plugins/turbot/kubernetes), for example, creates some tables dynamically by reading the CRD data. Furthermore, the structure may not be identical across multiple connections. When Steampipe aggregates this data: - Steampipe performs a merge, where the table in the aggregator contains the union of all columns from all connections. - If a connection does not contain a given column, it will be null in the aggregated result for all rows from that connection. - If a column has the same name but different data type across connections, the column will be returned as JSONB. ## Setting the Search Path Postgres allows you to set a [schema search path](https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATH) to control the resolution order of unqualified names. When using unqualified names, the first object in the search path that matches the object name will be used. For example, assume you have 3 connections that use the `aws` plugin, named `aws_01`, `aws_02`, and `aws_03`, and you run the query `select * from aws_account`. In this query, the table name is unqualified, so the first schema (connection) in the search path that implements the `aws_account` table will be used. By default, the search path puts the public schema first, followed by all connection schemas ordered alphabetically, thus the query will return results from `aws_01.aws_account`. To instead return results from `aws_02`, you can simply change the search path and re-run the query. Usually, you will not want to replace the entire search path, but rather *prefer* a given connection. To simplify this case, set the `search_path_prefix`. Setting the prefix will not *replace* the entire path, but will merely *prepend* the the prefix to the front of the search path. You can change the default search path in many places, and the active path will be determined from the most precise scope where it is set: 1. The session setting, as set by the most recent `.search_path` and/or .`search_path_prefix` meta-command. 1. The `--search-path` or `--search-path-prefix` command line arguments. 1. The `search_path` or `search_path_prefix` set in the `workspace`, in the `workspaces.spc` file. 1. The `search_path` or `search_path_prefix` set in the `database` global option, typically set in `~/.steampipe/config/default.spc` 1. The compiled default (`public`, then alphabetical by connection name) Note that setting the search path in the `workspace`, from the command line arguments, or via meta-commands sets the path for the session when running `steampipe`; this setting *will not* be in effect when connecting to Steampipe from 3rd party tools. Setting the `search_path` in the `database` options will set the `search_path` option in the database, however, and *will* be in effect when connecting from tools other than the `steampipe` cli. --- --- title: Run Steampipe sidebar_label: Run Steampipe --- # Manage Steampipe Steampipe is simple to install and manage and does not require any special expertise to get started. You will need to [install and update plugins](/docs/managing/plugins) and [manage connections](/docs/managing/connections), but any other configuration is optional. 
If you wish, you may [run Steampipe as a local service](/docs/managing/service), exposing the database endpoint for connection from any Postgres-compatible database client. --- --- title: Managing Plugins sidebar_label: Plugins --- # Managing Plugins Steampipe provides an integrated, standardized SQL interface for querying various services, but it relies on **plugins** to define and implement tables for those services. This approach decouples the core Steampipe code from the provider-specific implementations, providing flexibility and extensibility. ## Installing Plugins Steampipe plugins are packaged as Open Container Images (OCI) and stored in the [Steampipe Hub registry](https://hub.steampipe.io). This registry contains a curated set of plugins developed by and/or vetted by Turbot. To install the latest version of a standard plugin, you can simply install it by name. For example, to install the latest `aws` plugin: ``` $ steampipe plugin install aws ``` This will download the latest aws plugin from the hub registry, and will set up a default connection named `aws`. > Note: If you install multiple versions of a plugin only the first installation will create a connection automatically for you, you will need to create/edit a [connection](/docs/managing/connections) configuration file in order to use the additional versions of the plugin. ### Installing a Specific Version To install a specific version, simply specify the version tag after the plugin name, separated by `@` or `:` For example, to install the 0.118.0 version of the aws plugin: ``` $ steampipe plugin install aws@0.118.0 ``` This will download the aws plugin version 0.118.0 (the one with the `0.118.0` tag) from the hub registry. ### Installing from a SemVer Constraint Plugins should follow [semantic versioning](https://semver.org/) guidelines, and they are tagged in the registry with a **version tag** that specifies their *exact version* in the `major.minor.patch` format (e.g. `1.0.1`). The intent of the version tag is that it is immutable - while it is technically possible to move the version tag to a different image version, this should not be done. Installing with a semver constraint allows you to "lock" (or pin) to a specific set of releases which match the contraints. If you install via `steampipe plugin install aws@^1`, for example, `steampipe plugin update` (and auto-updates) will only update to versions greater than `1.0.0` but less than `2.0.0`. Supported semver constraint types: **Wildcard Constraint**: This matches any version for a particular segment (Major, Minor, or Patch). - `1.x.x` would match any version with major segment of `1`. - `1.2.x` would match any version with the major segment of `1` and a minor segment of `2`. **Caret Constraint (^)**: This matches versions that do not modify the left-most non-zero digit. - `^1.2.3` is the latest version equal or greater than `1.2.3`, but less than `2.0.0`. - `^0.1.2` is the latest version equal or greater than `0.1.2`, but less than `0.2.0`. **Tilde Constraint (~)**: This matches versions based on expression, if minor segment is expressed, locks to it, else locks to major. - `~1` is the latest version greater than or equal to `1.0.0`, but less than `2.0.0` (same as `1.x.x`). - `~1.2` is the latest version greater than or equal to `1.2.0`, but less than `1.3.0` (same as `1.2.x`). - `~1.2.3` is the latest version greater than or equal to `1.2.3`, but less than `1.3.0`. **Range Constraint**: This specifies a range of versions using a hyphen. 
- `1.2.3-1.2.5` would limit to latest available version of `1.2.3`,`1.2.4` or `1.2.5`. **Other Constraints**: - `>1.1.1` would match any version greater than `1.1.1`. - `>=1.2.0` would match any version greater than or equal to `1.2.0`. You can use the install command in the same way as a specific version with these constraints (`imagename@constraint`) syntax: > Note: For some constraints using special characters `>`, `<`, `*` you may need to escape the characters `\>` or quote the string `steampipe plugin install "aws@>0.118.0"` depending on your terminal. - To install the latest version locked to a specific major version: ```bash $ steampipe plugin install aws@^2 # or $ steampipe plugin install aws@2.x.x ``` - To install the latest version locked to a specific minor version: ```bash $ steampipe plugin install aws@~2.1 # or $ steampipe plugin install aws@2.1.x ``` ### Installing from another registry Steampipe plugins are packaged in OCI format and can be hosted and installed from any artifact repository or container registry that supports OCI V2 images. To install a plugin from a repository, specify the full path in the install command: ``` $ steampipe plugin install us-docker.pkg.dev/myproject/myrepo/myplugin@mytag ``` ### Installing from a File A plugin binary can be installed manually, and this is often convenient when developing the plugin. Steampipe will attempt to load any plugin that is referred to in a `connection` configuration: - The plugin binary file must have a `.plugin` extension - The plugin binary must reside in a subdirectory of the `~/.steampipe/plugins/` directory and must be the ONLY `.plugin` file in that subdirectory - The `connection` must specify the path (relative to `~/.steampipe/plugins/`) to the plugin in the `plugin` argument For example, consider a `myplugin` plugin that you have developed. To install it: - Create a subdirectory `.steampipe/plugins/local/myplugin` - Name your plugin binary `myplugin.plugin`, and copy it to `.steampipe/plugins/local/myplugin/myplugin.plugin` - Create a `~/.steampipe/config/myplugin.spc` config file containing a connection definition that points to your plugin: ```hcl connection "myplugin" { plugin = "local/myplugin" } ``` ### Installing Missing Plugins You can install all missing plugins that are referenced in your configuration files: ```bash $ steampipe plugin install ``` Running `steampipe plugin install` with no arguments will cause Steampipe to read all `connection` and `plugin` blocks in all `.spc` files in the `~/.steampipe/config` directory and install any that are referenced but are not installed. Note that when doing so, any default `.spc` file that does not exist in the configuration will also be copied. 
You may pass the `--skip-config` flag if you don't want to copy these files: ```bash $ steampipe plugin install --skip-config ``` ## Viewing Installed Plugins You can list the installed plugins with the `steampipe plugin list` command: ``` $ steampipe plugin list ┌─────────────────────────────────────────────────────┬─────────┬─────────────────────────────────────────────┐ │ NAME │ VERSION │ CONNECTIONS │ ├─────────────────────────────────────────────────────┼─────────┼─────────────────────────────────────────────┤ │ hub.steampipe.io/plugins/turbot/aws@latest │ 0.4.0 │ aws,aws_account_aaa,aws_account_aab │ │ hub.steampipe.io/plugins/turbot/digitalocean@latest │ 0.1.0 │ digitalocean │ │ hub.steampipe.io/plugins/turbot/gcp@latest │ 0.0.6 │ gcp_project_a,gcp,gcp_project_b │ │ hub.steampipe.io/plugins/turbot/github@latest │ 0.0.5 │ github │ │ hub.steampipe.io/plugins/turbot/steampipe@latest │ 0.0.2 │ steampipe │ └─────────────────────────────────────────────────────┴─────────┴─────────────────────────────────────────────┘ ``` ## Updating Plugins To update a plugin to the latest version for a given stream, you can use the `steampipe plugin update` command: ``` steampipe plugin update plugin_name[@stream] ``` The syntax and semantics are identical to the install command - `steampipe plugin update aws` will get the latest aws plugin, `steampipe plugin update aws@1` will get the latest in the 1.x major stream, etc. To update **all** plugins to the latest in the installed stream: ```bash steampipe plugin update --all ``` ## Uninstalling Plugins You can uninstall a plugin with the `steampipe plugin uninstall` command: ``` steampipe plugin uninstall [plugin] ``` Note that you can remove a plugin that has active connections using it. You should remove any connections for the uninstalled plugin as part of cleanup: ``` $ steampipe plugin uninstall azure Uninstalled plugin: * turbot/azure Please remove this connection to continue using steampipe: * /Users/cbruno/.steampipe/config/azure.spc 'dev' (line 1) 'staging' (line 6) 'prod' (line 11) ``` --- --- title: Service Mode sidebar_label: Service Mode --- # Service Mode By default, when you run `steampipe query`, Steampipe will start the database if it is not already running. In this case, the database only listens on the loopback address (127.0.0.1) - You cannot connect over the network. Steampipe will shut it down at the end of the query command or session if there are no other active steampipe sessions. Alternatively, you can run Steampipe in service mode. Running `steampipe service start` will run Steampipe as a local service, exposing it as a database endpoint for connection from any Postgres-compatible database client. ## Starting the database in service mode When you run `steampipe service start`, Steampipe will start in service mode. Steampipe prints connection information to the console that you can use in connection strings for your application or 3rd party tools: ```bash $ steampipe service start Steampipe service is running: Database: Host(s): localhost, 127.0.0.1, 192.168.10.174 Port: 9193 Database: steampipe User: steampipe Password: 4cbe-4bc2-9c18 Connection string: postgres://steampipe:4cbe-4bc2-9c18@localhost:9193/steampipe Managing the Steampipe service: # Get status of the service steampipe service status # Restart the service steampipe service restart # Stop the service steampipe service stop ``` Once the service is started, you can [connect to the Steampipe](/docs/integrations/overview) from tools that integrate with Postgres. 
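For example, assuming you have the `psql` client installed, you can connect using the connection string that `steampipe service start` printed above (your generated password will differ):

```bash
# Connect to the running Steampipe service with psql.
psql "postgres://steampipe:4cbe-4bc2-9c18@localhost:9193/steampipe"
```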
## Stopping the service To stop the Steampipe service, issue the `steampipe service stop` command. --- --- title: Managing Workspaces sidebar_label: Workspaces --- # Managing Workspaces A Steampipe `workspace` is a "profile" that allows you to define a unified environment that the Steampipe client can interact with. Each workspace is composed of: - a single Steampipe database instance - context-specific settings and options (snapshot location, query timeout, etc) Steampipe workspaces allow you to [define multiple named configurations](#defining-workspaces): ```hcl workspace "local" { workspace_database = "local" } workspace "acme_prod" { workspace_database = "acme/prod" snapshot_location = "acme/prod" query_timeout = 600 } ``` and [easily switch between them](#using-workspaces) using the `--workspace` argument or `STEAMPIPE_WORKSPACE` environment variable: ```bash steampipe query --workspace local "select * from aws_account" steampipe query --workspace acme_prod "select * from aws_account" ``` Turbot Pipes workspaces are [automatically supported](#implicit-workspaces): ```bash steampipe query --workspace acme/dev "select * from aws_account" ``` ## Defining Workspaces [Workspace](/docs/reference/config-files/workspace) configurations can be defined in any `.spc` file in the `~/.steampipe/config` directory, but by convention they are defined in `~/.steampipe/config/workspaces.spc` file. This file may contain multiple `workspace` definitions that can then be referenced by name. Any unset arguments will assume the default values - you don't need to set them all: ```hcl workspace "default" { query_timeout = 300 } ``` You can use `base=` to inherit settings form another profile: ```hcl workspace "dev" { base = workspace.default workspace_database = "acme/dev" } ``` The `workspace_database` may be `local` (which is the default): ```hcl workspace "local_db" { workspace_database = "local" } ``` or a Turbot Pipes workspace, in the form of `{identity_handle}/{workspace_handle}`: ```hcl workspace "acme_prod" { workspace_database = "acme/prod" } ``` The `snapshot_location` can also be a Turbot Pipes workspace, in the form of `{identity_handle}/{workspace_handle}`: ```hcl workspace "acme_prod" { workspace_database = "acme/prod" snapshot_location = "acme/prod" } ``` If it doesn't match the `{identity_handle}/{workspace_handle}` pattern it will be interpreted to be a path to a directory in the local filesystem where snapshots should be written to: ```hcl workspace "local" { workspace_database = "local" snapshot_location = "home/raj/my-snapshots" } ``` You can specify [`options` blocks to set options for steampipe query](/docs/reference/config-files/options#query-options): ```hcl workspace "local_dev" { search_path_prefix = "aws_all" query_timeout = 300 pipes_token = "tpt_999faketoken99999999_111faketoken1111111111111" pipes_host = "pipes.turbot.com" snapshot_location = "acme/dev" workspace_database = "local" options "query" { multi = false # true, false output = "table" # json, csv, table, line header = true # true, false separator = "," # any single char timing = true # true, false autocomplete = true } } ``` You can even set the `install_dir` for a workspace if you want to use the data layer from another [Steampipe installation directory](https://steampipe.io/docs/reference/env-vars/steampipe_install_dir). 
This allows you to define workspaces that use a database from another installation directory: ```hcl workspace "steampipe_2" { install_dir = "~/steampipe2" } ``` and easily switch between them with the `--workspace` flag: ```bash steampipe query --workspace steampipe_2 "select * from aws_account" ``` ## Using Workspaces Workspaces may be defined in any `.spc` file in the `~/.steampipe/config` directory, but by convention they should be placed in the `~/.steampipe/config/workspaces.spc` file. The workspace named `default` is special; if a workspace named `default` exists, `--workspace` is not specified in the command, and `STEAMPIPE_WORKSPACE` is not set, then Steampipe uses the `default` workspace: ```bash steampipe query --snapshot "select * from aws_account" ``` You can pass any workspace to `--workspace` to use its values: ```bash steampipe query --snapshot --workspace=acme_dev "select * from aws_account" ``` Or do the same with the `STEAMPIPE_WORKSPACE` environment variable: ```bash STEAMPIPE_WORKSPACE=acme_dev steampipe query --snapshot "select * from aws_account" ``` If you specify the `--workspace` argument and the `STEAMPIPE_WORKSPACE` environment variable, the `--workspace` argument wins: ```bash # acme_prod will be used as the effective workspace export STEAMPIPE_WORKSPACE=acme_dev steampipe query --snapshot --workspace=acme_prod "select * from aws_account" ``` If you specify the `--workspace` argument and more specific arguments (`workspace_database`, etc), any more specific arguments will override the workspace values: ```bash # will use "local" as the db, and acme_prod workspace for any OTHER options steampipe query --snapshot \ --workspace=acme_prod \ --workspace_database=local \ "select * from aws_account" ``` Environment variable values override `default` workspace settings when the `default` workspace is *implicitly used*: ```bash # will use acme/dev as DB, but get the rest of the values from default workspace export STEAMPIPE_WORKSPACE_DATABASE=acme/dev steampipe query "select * from aws_account" ``` If the default workspace is *explicitly* passed to the `--workspace` argument, its values will override any individual environment variables: ```bash # will NOT use acme/dev as DB - will use ALL of the values from default workspace export STEAMPIPE_WORKSPACE_DATABASE=acme/dev steampipe query --snapshot --workspace=default "select * from aws_account" ``` The same is true of any named workspace: ```bash # will NOT use acme/dev as DB - will use ALL of the values from acme_prod workspace export STEAMPIPE_WORKSPACE_DATABASE=acme/dev steampipe query --workspace=acme_prod "select * from aws_account" ``` ## Implicit Workspaces Named workspaces follow normal standards for HCL identifiers, thus they cannot contain the slash (`/`) character. If you pass a value to `--workspace` or `STEAMPIPE_WORKSPACE` in the form of `{identity_handle}/{workspace_handle}`, it will be interpreted as an **implicit workspace**. Implicit workspaces, as the name suggests, do not need to be specified in the `workspaces.spc` file. Instead they will be assumed to refer to a Turbot Pipes workspace, which will be used as both the database (`workspace_database`) and snapshot location (`snapshot_location`). Essentially, `--workspace acme/dev` is equivalent to: ```hcl workspace "acme/dev" { workspace_database = "acme/dev" snapshot_location = "acme/dev" } ``` --- --- title: Flowpipe sidebar_label: Flowpipe --- # Flowpipe [Flowpipe](https://flowpipe.io/) is an automation and workflow engine designed for DevOps tasks. 
It allows you to define and execute complex workflows using code, making it ideal for automating cloud infrastructure management across platforms like AWS, Azure, and GCP. Flowpipe enables you to turn your Steampipe insights into action! - Detect and correct misconfigurations leading to cost savings opportunities with Thrifty mods for [AWS](https://hub.flowpipe.io/mods/turbot/aws_thrifty), [Azure](https://hub.flowpipe.io/mods/turbot/azure_thrifty), or [GCP](https://hub.flowpipe.io/mods/turbot/gcp_thrifty). - Automate your resource tagging standards with Tags mods for [AWS](https://hub.flowpipe.io/mods/turbot/aws_tags) or [Azure](https://hub.flowpipe.io/mods/turbot/azure_tags). - [Build your own mods](https://flowpipe.io/docs/build) with simple HCL to create custom pipelines tailored to your specific needs. --- --- title: Pipes Ecosystem sidebar_label: Pipes Ecosystem --- # Pipes Ecosystem Steampipe is a part of a larger ecosystem of products that provide a flexible platform on which to build rich intelligence, automation & security solutions. - [Steampipe](https://steampipe.io/) provides a dynamic cloud inventory, allowing you to to query and analyze data from multiple sources via SQL. Its plugin-based architecture ensures extensibility, enabling connections to APIs, cloud services, databases, and more. - [Powerpipe](/docs/pipes-ecosystem/powerpipe) adds analysis and visualization to your Steampipe data, providing dashboards, reports, and benchmarks to analyze and audit your environment. - [Flowpipe](/docs/pipes-ecosystem/flowpipe) further enriches the platform with workflow orchestration. Flowpipe allows you to create "pipelines as code" to define workflows and other tasks that run in a sequence. With Flowpipe, you can turn your Steampipe insights into into action! - [Turbot Pipes](/docs/pipes-ecosystem/pipes) provides a hosted cloud intelligence, automation & security platform built specifically for DevOps. Pipes provides a managed Steampipe database instances, shared Powerpipe dashboards & snapshots, Flowpipe triggers & pipelines, and more! --- --- title: Turbot Pipes sidebar_label: Turbot Pipes --- # Turbot Pipes **[Turbot Pipes](/docs/steampipe-cloud)** is the only intelligence, automation & security platform built specifically for DevOps. Pipes provides hosted Steampipe database instances, shared dashboards, snapshots, and more! While the Steampipe CLI is optimized for a single developer doing a single thing at a single point in time, Pipes is designed for many users doing many things across time. Turbot Pipes provides additional benefits above and beyond the Steampipe CLI: - **Managed Steampipe instance**. The Steampipe workspace database instance hosted in Turbot Pipes is available via a public Postgres endpoint. You can query the workspace from the Turbot Pipes web console, run queries or controls from a remote Steampipe CLI instance, or connect to your workspace from many [third-party tools](https://turbot.com/pipes/docs/connect). - **Multi-user support**. Steampipe [Organizations](https://turbot.com/pipes/docs/organizations) allow you to collaborate and share workspaces and connections with your team. With [Pipes Enterprise](https://turbot.com/pipes/docs/plans/enterprise), you can create your own isolated [Tenant](https://turbot.com/pipes/docs/tenants), with a custom domain for your environment (e.g. `acme.pipes.turbot.com`). Your tenant has its own discrete set of user accounts, organizations, and workspaces, giving you centralized visibility and control. 
You can choose which [authentication methods](https://turbot.com/pipes/docs/tenants/settings#authentication-methods) are allowed, configure which [domains to trust](https://turbot.com/pipes/docs/tenants/settings#trusted-login-domains), and set up [SAML](https://turbot.com/pipes/docs/tenants/settings#saml) to integrate your Pipes tenant with your single-sign-on solution. - **Snapshot Scheduling and sharing**. Turbot Pipes allows you to[ save and share dashboard snapshots](https://turbot.com/pipes/docs/dashboards#saving--sharing-snapshots), either internally with your team or publicly with a sharing URL. You can even [schedule snapshots ](https://turbot.com/pipes/docs/queries#scheduling-query-snapshots) and be notified when complete. - **Persistent CMDB with Datatank**. A Turbot Pipes [Datatank](https://turbot.com/pipes/docs/datatank) provides a mechanism to proactively query connections at regular intervals and store the results in a persistent schema. You can then query the stored results instead of the live schemas, resulting in reduced query latency (at the expense of data freshness). There's no cost to get started! - **[Sign up for Turbot Pipes →](https://pipes.turbot.com)** - **[Take me to the docs →](https://turbot.com/pipes/docs)** --- --- title: Powerpipe sidebar_label: Powerpipe --- # Powerpipe [Powerpipe](https://powerpipe.io/) is an open-source platform designed for building and visualizing cloud infrastructure dashboards, security benchmarks, and compliance controls. Powerpipe works seamlessly with Steampipe, enabling you to [visualize cloud configurations](#visualize-your-data-with-powerpipe-dashboards) and [assess security posture against a massive library of benchmarks](#run-security-and-compliance-benchmarks-with-powerpipe-benchmarks). ## Visualize your data with Powerpipe Dashboards [Visualize your cloud infrastructure](https://powerpipe.io/docs?slug=#visualize-cloud-infrastructure) with Powerpipe + Steampipe! Use [pre-built dashboards](https://hub.powerpipe.io/?objectives=dashboard&engines=steampipe) — for AWS, Azure, GCP, Kubernetes, and more — to visualize your cloud resources and answer common questions about resource quantity, cost, usage, and relationships, or [build your own]((https://powerpipe.io/docs?slug=#create-your-own-dashboards-and-benchmarks))! ![](/pipes-ecosystem/vpc_detail.png) ## Run security and compliance benchmarks with Powerpipe Benchmarks [Run security and compliance benchmarks](https://powerpipe.io/docs?slug=#run-security-and-compliance-benchmarks) with Powerpipe + Steampipe! [Use pre-built benchmarks](https://hub.powerpipe.io/?objectives=cost,compliance,security,tags&engines=steampipe) to assess how well your clouds comply with the standard frameworks including CIS, GDPR, NIST, PCI, SOC 2, and more, or [build your own]((https://powerpipe.io/docs?slug=#create-your-own-dashboards-and-benchmarks))! ![](/pipes-ecosystem/benchmark_dashboard_view.png) --- --- title: Batch Queries sidebar_label: Batch Queries --- # Batch Queries Steampipe queries can provide valuable insight into your cloud configuration, and the interactive client is a powerful tool for ad hoc queries and exploration. Often, however, you will write a query that you will want to re-run in the future, either manually or perhaps as a cron job. Steampipe allows you to save your query to a file, and pass the file into the `steampipe query` command. For example, lets create a query to find S3 buckets where versioning is not enabled. 
Paste the following snippet into a file named `s3_versioning_disabled.sql`: ```sql select name, region, account_id, versioning_enabled from aws_s3_bucket where not versioning_enabled; ``` We can now run the query by passing the file name to `steampipe query` ```bash steampipe query s3_versioning_disabled.sql ``` You can even run multiple sql files by passing a glob or a space separated list of file names to the command: ```bash steampipe query *.sql ``` ## Query output formats By default, the output format is `table`, which provides a tabular, human-readable view: ``` +-----------------------+---------------+-----------+ | vpc_id | cidr_block | state | +-----------------------+---------------+-----------+ | vpc-0de60777fdfd2ebc7 | 10.66.8.0/22 | available | | vpc-9d7ae1e7 | 172.31.0.0/16 | available | | vpc-0bf2ca1f6a9319eea | 172.16.0.0/16 | available | +-----------------------+---------------+-----------+ ``` You can use the `--output` argument to output in a different format. To print your output to json, specify `--output json`: ``` $ steampipe query "select vpc_id, cidr_block,state from aws_vpc" --output json [ { "cidr_block": "10.66.8.0/22", "state": "available", "vpc_id": "vpc-0de60777fdfd2ebc7" }, { "cidr_block": "172.31.0.0/16", "state": "available", "vpc_id": "vpc-9d7ae1e7" }, { "cidr_block": "172.16.0.0/16", "state": "available", "vpc_id": "vpc-0bf2ca1f6a9319eea" } ] ``` To print your output to csv, specify `--output csv`: ``` $ steampipe query "select vpc_id, cidr_block,state from aws_vpc" --output csv vpc_id,cidr_block,state vpc-0de60777fdfd2ebc7,10.66.8.0/22,available vpc-9d7ae1e7,172.31.0.0/16,available vpc-0bf2ca1f6a9319eea,172.16.0.0/16,available ``` Redirecting the output to CSV is common way to export data for use in other tools, such as Excel: ``` steampipe query "select vpc_id, cidr_block,state from aws_vpc" --output csv > vpcs.csv ``` To use a different delimiter, you can specify the `--separator` argument. For example, to print to a pipe-separated format: ``` $ steampipe query "select vpc_id, cidr_block,state from aws_vpc" --output csv --separator '|' vpc_id|cidr_block|state vpc-0bf2ca1f6a9319eea|172.16.0.0/16|available vpc-9d7ae1e7|172.31.0.0/16|available vpc-0de60777fdfd2ebc7|10.66.8.0/22|available ``` --- --- title: AI Tools (MCP) --- # AI Tools (MCP) The [Steampipe MCP server](https://github.com/turbot/steampipe-mcp) (Model Control Protocol) transforms how you interact with your cloud infrastructure data. It brings the power of conversational AI to your cloud resources and configurations, allowing you to extract critical insights using plain English — no complex SQL required! The Steampipe [MCP](https://modelcontextprotocol.io/introduction) enables Large Language Models (LLMs) to query your Steampipe data directly. This allows you to query your cloud infrastructure using natural language, making data exploration and analysis more intuitive and accessible. It works with both local [Steampipe](https://steampipe.io/downloads) installations and [Turbot Pipes](https://turbot.com/pipes) workspaces, providing safe, read-only access to all your cloud and SaaS data. The MCP is packaged separately and runs as an integration in your AI tool, such as [Claude Desktop](https://claude.ai/download) or [Cursor](https://www.cursor.com/). 
## Installation ### Prerequisites - [Steampipe](https://steampipe.io/downloads) installed and configured - [Node.js](https://nodejs.org/) v16 or higher (includes `npx`) - An AI assistant that supports [MCP](https://modelcontextprotocol.io/introduction), such as [Cursor](https://www.cursor.com/) or Anthropic's [Claude Desktop](https://claude.ai/download). ### Configuration The Steampipe MCP server is packaged and distributed as an NPM package; just add Steampipe MCP to your AI assistant's configuration file and restart your AI assistant for the changes to take effect: ```json { "mcpServers": { "steampipe": { "command": "npx", "args": [ "-y", "@turbot/steampipe-mcp" ] } } } ``` By default, this connects to your local Steampipe installation at `postgresql://steampipe@localhost:9193/steampipe`. Make sure to run `steampipe service start` first. To connect to a [Turbot Pipes](https://turbot.com/pipes) workspace instead, add your [connection string](https://turbot.com/pipes/docs/using/steampipe/developers#database) to the args: ```json { "mcpServers": { "steampipe": { "command": "npx", "args": [ "-y", "@turbot/steampipe-mcp", "postgresql://my_name:my_pw@workspace-name.usea1.db.pipes.turbot.com:9193/abc123" ] } } } ``` | Assistant | Config File Location | Setup Guide | |-----------|---------------------|-------------| | Claude Desktop | `claude_desktop_config.json` | [Claude Desktop MCP Guide →](https://modelcontextprotocol.io/quickstart/user) | | Cursor | `~/.cursor/mcp.json` | [Cursor MCP Guide →](https://docs.cursor.com/context/model-context-protocol) | Refer to the [README](https://github.com/turbot/steampipe-mcp/blob/main/README.md) for additional configuration options. ## Querying Steampipe To query Steampipe, just ask questions using natural language! Explore your cloud infrastructure: ``` What AWS accounts can you see? ``` Simple, specific questions work well: ``` Show me all S3 buckets that were created in the last week ``` Generate infrastructure reports: ``` List my EC2 instances with their attached EBS volumes ``` Dive into security analysis: ``` Find any IAM users with access keys that haven't been rotated in the last 90 days ``` Get compliance insights: ``` Show me all EC2 instances that don't comply with our tagging standards ``` Explore potential risks: ``` Analyze my S3 buckets for security risks including public access, logging, and encryption ``` ## Best Practices for Prompts To get the most accurate and helpful responses from the MCP service, consider these best practices when formulating your prompts: 1. **Use natural language**: The LLM will handle the SQL translation 2. **Be specific**: Indicate which cloud resources you want to analyze (EC2, S3, IAM, etc.) 3. **Include context**: Mention regions or accounts if you're interested in specific ones 4. **Ask for explanations**: Request the LLM to explain its findings after presenting the data 5. **Iterative refinement**: Start with simple queries and then refine based on initial results 6. **Be bold and exploratory**: It's amazing what the LLM will discover and achieve! ## Limitations - The quality of SQL generation depends on the LLM's understanding of your prompt and the Steampipe schema. - Complex analytical queries may require iterative refinement. - Response times depend on both the LLM API latency and query execution time. - The MCP server only runs locally at this time. You must run it from the same machine as your AI assistant. 
- A valid subscription to the LLM provider is recommended; free plan limits are often insufficient for using the Steampipe MCP server. --- --- title: Query Steampipe sidebar_label: Query Steampipe --- # Query Steampipe Steampipe is built on [PostgreSQL](https://www.postgresql.org/), and you can use [standard SQL syntax](https://www.postgresql.org/docs/14/sql.html) to query Steampipe. It's easy to [get started writing queries](/docs/sql/steampipe-sql), and the [Steampipe Hub](https://hub.steampipe.io/mods) provides ***thousands of example queries*** that you can use or modify for your purposes. There are [example queries for each table](https://hub.steampipe.io/plugins/turbot/aws/tables/aws_s3_bucket) in every plugin, and you can also [browse, search, and view the queries](https://hub.steampipe.io/mods/turbot/aws_insights/queries) in every published mod! ## Interactive Query Shell Steampipe provides an [interactive query shell](/docs/query/query-shell) with features like auto-complete, syntax highlighting, and command history to assist you in writing queries. To open the query shell, run `steampipe query` with no arguments: ```bash $ steampipe query > ``` Notice that the prompt changes, indicating that you are in the Steampipe shell. You can exit the query shell by pressing `Ctrl+d` on a blank line, or using the `.exit` command. ## Non-interactive (batch) query mode The Steampipe interactive query shell is a great platform for exploring your data and developing queries, but Steampipe is more than just a query shell! Steampipe allows you to [run a query in batch mode](/docs/query/batch-query) and write the results to standard output (stdout). This is useful if you wish to redirect the output to a file, pipe to another command, or export data for use in other tools. To run a query from the command line, specify the query as an argument to `steampipe query`: ```bash steampipe query "select vpc_id, cidr_block, state from aws_vpc" ``` ## AI Tools (MCP) The [Steampipe MCP server](/docs/query/mcp) transforms how you interact with your cloud infrastructure data. It brings the power of conversational AI to your cloud resources and configurations, allowing you to extract critical insights using plain English — no complex SQL required! The MCP is packaged separately and runs as an integration in your AI tool, such as [Claude Desktop](https://claude.ai/download) or [Cursor](https://www.cursor.com/). ## Third Party Tools Because Steampipe is built on Postgres, you can [connect to the Steampipe database with 3rd party tools](/docs/query/third-party), or write code against your database using your favorite library! --- --- title: Interactive Queries sidebar_label: Interactive Queries --- # Interactive Query Shell Steampipe provides an interactive query shell with features like auto-complete, syntax highlighting, and command history to assist you in [writing queries](/docs/sql/steampipe-sql). To open the query shell, run `steampipe query` with no arguments: ```bash $ steampipe query > ``` Notice that the prompt changes, indicating that you are in the Steampipe shell. You can exit the query shell by pressing `Ctrl+d` on a blank line, or using the `.exit` command. ### Autocomplete The query shell includes an autocomplete feature that will suggest words as you type. Type `.` (period).
Notice that the autocomplete appears with a list of the [Steampipe meta-commands](/docs/reference/dot-commands/overview) commands that start with `.`: ![](/auto-complete-1.png) As you continue to type, the autocomplete will continue to narrow down the list of tables to only those that match. You can cycle forward through the list with the `Tab` key, or backward with `Shift+Tab`. Tab to select `.tables` and hit enter. The `.tables` command is executed, and lists all the tables that are installed and available to query. ### History The query shell supports command history, allowing you to retrieve, run, and edit previous commands. The command history works like typical unix shell command history, and persists across query sessions. When on a new line, you can cycle back through the history with the `Up Arrow` or `Ctrl+p` and forward with `Down Arrow` or `Ctrl+n`. ### Key bindings The query shell supports standard emacs-style key bindings: | Keys | Description |-|- | `Ctrl+a` | Move the cursor to the beginning of the line | `Ctrl+e` | Move the cursor to the end of the line | `Ctrl+f` | Move the cursor forward 1 character | `Ctrl+b` | Move the cursor backward 1 character | `Ctrl+w` | Delete a word backwards | `Ctrl+d` | Delete a character forwards. On a blank line, `Ctrl+d` will exit the console | `Backspace` | Delete a character backwards | `Ctrl+p`, `Up Arrow` | Go to the previous command in your history | `Ctrl+n`, `Down Arrow` | Go to the next command in your history ## Exploring Tables & Connections ### Connections A Steampipe **Connection** represents a set of tables for a single data source. Each connection is represented as a distinct Postgres schema. A connection is associated with a single instance of a single [plugin](/docs/managing/plugins) type. The boundary and scope of the connection varies by plugin, but is typically aligned with the vendor's CLI tool or API: - An `azure` connection contains tables for a single Azure subscription - An `aws` connection contains tables for a single AWS account To view the installed connections, you can use the `.connections` : ``` > .connections +------------+--------------------------------------------------+ | Connection | Plugin | +------------+--------------------------------------------------+ | aws | hub.steampipe.io/plugins/turbot/aws@latest | | github | hub.steampipe.io/plugins/turbot/github@latest | | steampipe | hub.steampipe.io/plugins/turbot/steampipe@latest | +------------+--------------------------------------------------+ To get information about the tables in a connection, run '.inspect {connection}' To get information about the columns in a table, run '.inspect {connection}.{table}' ``` Alternately, you can use `.inspect` command with no arguments. The output is the same: ``` > .inspect +------------+--------------------------------------------------+ | Connection | Plugin | +------------+--------------------------------------------------+ | aws | hub.steampipe.io/plugins/turbot/aws@latest | | github | hub.steampipe.io/plugins/turbot/github@latest | | steampipe | hub.steampipe.io/plugins/turbot/steampipe@latest | +------------+--------------------------------------------------+ To get information about the tables in a connection, run '.inspect {connection}' To get information about the columns in a table, run '.inspect {connection}.{table}' ``` ### Tables Steampipe **tables** provide an interface for querying dynamic data using standard SQL. Steampipe tables do not actually *store* data, they query the source on the fly. 
The details are hidden from you though - *you just query them like any other table!* To view the tables in all active connections, you can use the `.tables` command: ``` > .tables ==> aws +----------------------------------------+--------------------------------+ | Table | Description | +----------------------------------------+--------------------------------+ | aws_acm_certificate | AWS ACM Certificate | | aws_api_gateway_api_key | AWS API Gateway API Key | | aws_api_gateway_authorizer | AWS API Gateway Authorizer | ... +----------------------------------------+--------------------------------+ ==> github +---------------------+-------------+ | Table | Description | +---------------------+-------------+ | github_gist | | | github_license | | | github_organization | | | github_repository | | | github_team | | | github_user | | +---------------------+-------------+ To get information about the columns in a table, run '.inspect {connection}.{table}' ``` To view only the tables in a specific connection, you can use the `.inspect` command with a connection name. For example, to show all the tables in the `aws` connection: ``` > .inspect aws +----------------------------------------+--------------------------------+ | Table | Description | +----------------------------------------+--------------------------------+ | aws_acm_certificate | AWS ACM Certificate | | aws_api_gateway_api_key | AWS API Gateway API Key | | aws_api_gateway_authorizer | AWS API Gateway Authorizer | | aws_api_gateway_rest_api | AWS API Gateway Rest API | ... +----------------------------------------+--------------------------------+ To get information about the columns in a table, run '.inspect {connection}.{table}' ``` ### Columns To get information about the **columns** in a table, run `.inspect {connection}.{table}`: ``` > .inspect aws.aws_iam_group +----------------------+-----------------------------+--------------------------------+ | Column | Type | Description | +----------------------+-----------------------------+--------------------------------+ | account_id | text | The AWS Account ID in which | | | | the resource is located | | akas | jsonb | A list of AKAs (also-known-as) | | | | that uniquely identify this | | | | resource | | arn | text | The Amazon Resource Name (ARN) | | | | specifying the group | | attached_policy_arns | jsonb | A list of managed policies | | | | attached to the group | | create_date | timestamp without time zone | The date and time, when the | | | | group was created | | group_id | text | The stable and unique string | | | | identifying the group | | inline_policies | jsonb | A list of policy documents | | | | that are embedded as inline | | | | policies for the group | | name | text | The friendly name that | | | | identifies the group | | partition | text | The AWS partition in which | | | | the resource is located (aws, | | | | aws-cn, or aws-us-gov) | | path | text | The path to the group | | region | text | The AWS Region in which the | | | | resource is located | | title | text | The display name for this | | | | resource | | users | jsonb | A list of users in the group | +----------------------+-----------------------------+--------------------------------+ ``` --- --- title: Snapshots sidebar_label: Snapshots --- # Snapshots Steampipe allows you to take **snapshots**. 
A snapshot is a saved view of your query results that you can view as a [dashboard in Powerpipe](https://powerpipe.io/docs/run/dashboard). All data and metadata for a snapshot are contained in a JSON file which can be saved and viewed locally in the Powerpipe dashboard or uploaded to [Turbot Pipes](https://turbot.com/pipes/docs). Snapshots in Turbot Pipes may be shared with other Turbot Pipes users or made public (shared with anyone that has the link). You can create Turbot Pipes snapshots directly from the Steampipe CLI, however if you wish to subsequently [modify](https://turbot.com/pipes/docs/dashboards#managing-snapshots) them (add/remove tags, change visibility) or delete them, you must do so from the Turbot Pipes console. You may [browse the snapshot list](https://turbot.com/pipes/docs/dashboards#browsing-snapshots) in Turbot Pipes by clicking the **Snapshots** button on the top of your workspace's **Dashboards** page. ## Taking Snapshots > To upload snapshots to Turbot Pipes, you must either [log in via the `steampipe login` command](/docs/reference/cli/login) or create an [API token](https://turbot.com/pipes/docs/profile#tokens) and pass it via the [`--pipes-token`](/docs/reference/cli/overview#global-flags) flag or [`PIPES_TOKEN`](/docs/reference/env-vars/pipes_token) environment variable. To take a snapshot and save it to [Turbot Pipes](https://turbot.com/pipes/docs), simply add the `--snapshot` flag to your command. ```bash steampipe query --snapshot "select * from aws_ec2_instance" ``` The `--snapshot` flag will create a snapshot with `workspace` visibility in your user workspace. A snapshot with `workspace` visibility is visible only to users that have access to the workspace in which the snapshot resides -- a user must be authenticated to Turbot Pipes with permissions on the workspace. If you want to create a snapshot that can be shared with *anyone*, use the `--share` flag instead. This will create the snapshot with `anyone_with_link` visibility: ```bash steampipe query --share "select * from aws_ec2_instance" ``` You can set a snapshot title in Turbot Pipes with the `--snapshot-title` argument: ```bash steampipe query --share --snapshot-title "Public Buckets" "select name from aws_s3_bucket where bucket_policy_is_public" ``` If you wish to save the snapshot to a different workspace, such as an org workspace, you can use the `--snapshot-location` argument with `--share` or `--snapshot`: ```bash steampipe query --share --snapshot-location vandelay-industries/latex "select * from aws_ec2_instance" ``` Note that the previous command ran the query against the *local* database, but saved the snapshot to the `vandelay-industries/latex` workspace. If you want to run the query against the remote `vandelay-industries/latex` database AND store the snapshot there, you can also add the `--workspace-database` argument: ```bash steampipe query --share --snapshot-location vandelay-industries/latex \ --workspace-database vandelay-industries/latex \ "select * from aws_ec2_instance" ``` Steampipe provides a shortcut for this though.
The `--workspace` flag supports [passing the cloud workspace](/docs/managing/workspaces#implicit-workspaces): ```bash steampipe query --snapshot --workspace vandelay-industries/latex "select * from aws_ec2_instance" ``` While not a common case, you can even run a query against a Turbot Pipes workspace database, but store the snapshot in an entirely different Turbot Pipes workspace: ```bash steampipe query --share --snapshot-location vandelay-industries/latex \ --workspace vandelay-industries/latex-prod \ "select * from aws_ec2_instance" ``` ## Tagging Snapshots You may want to tag your snapshots to make it easier to organize them. You can use the `--snapshot-tag` argument to add a tag: ```bash steampipe query --snapshot-tag env=local --snapshot \ "select * from aws_ec2_instance" ``` Simply repeat the flag to add more than one tag: ```bash steampipe query --snapshot-tag env=local --snapshot --snapshot-tag owner=george \ "select * from aws_ec2_instance" ``` ## Saving Snapshots to Local Files Turbot Pipes makes it easy to save and share your snapshots, however it is not strictly required; You can save and view snapshots using only the CLI. You can specify a local path in the `--snapshot-location` argument or `STEAMPIPE_SNAPSHOT_LOCATION` environment variable to save your snapshots to a directory in your filesystem: ```bash steampipe query --snapshot --snapshot-location . "select * from aws_account" ``` You can also set `snapshot_location` in a [workspace](/docs/managing/workspaces) if you wish to make it the default location. Alternatively, you can use the `--export` argument to export a query in the Steampipe snapshot format. This will create a file with a `.sps` extension in the current directory: ```bash steampipe query --export sps "select * from aws_account" ``` The `snapshot` export/output type is an alias for `sps`: ```bash steampipe query --export snapshot "select * from aws_account" ``` To give the file a name, simply use `{filename}.sps`, for example: ```bash steampipe query --export aws_accounts.sps "select * from aws_account" ``` Alternatively, you can write the steampipe snapshot to stdout with `--output sps` ```bash steampipe query --output sps "select * from aws_account" > aws_accounts.sps ``` or `--output snapshot` ```bash steampipe query --output snapshot "select * from aws_account" > aws_accounts.sps ``` ## Controlling Output When using `--share` or `--snapshot`, the output will include the URL to view the snapshot that you created in addition to the usual output: ```bash Snapshot uploaded to https://pipes.turbot.com/user/costanza/workspace/vandelay/snapshot/snap_abcdefghij0123456789_asdfghjklqwertyuiopzxcvbn ``` You can use the `--progress=false` argument to suppress displaying the URL and other progress data. This may be desirable when you are using an alternate output format, especially when piping the output to another command: ```bash steampipe query --snapshot --output json \ --progress=false "select * from aws_account" | jq ``` --- --- title: It's Just Postgres! sidebar_label: It's Just Postgres! --- # It's Just Postgres! Because Steampipe is built on Postgres, you can export your data, connect to the Steampipe database with 3rd party tools, or write code against your database using your favorite library. By default, when you run `steampipe query`, Steampipe will start the database and shut it down at the end of the query command or session. 
To connect from third party tools, you must run `steampipe service start` to start Steampipe in [service mode](/docs/managing/service). Once the service is started, you can connect to the Steampipe database from tools that integrate with Postgres, such as [TablePlus](https://tableplus.com/)! Query your data with 3rd party tools like TablePlus
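For example, assuming you are connecting from the same machine and using the default local connection details (user `steampipe`, database `steampipe`, port 9193), a quick way to verify the connection from the command line is with `psql`. If your client prompts for a password, you can retrieve it with `steampipe service status --show-password`:

```bash
# Start the Steampipe database service so that external clients can connect
steampipe service start

# Connect with psql using the default local connection string
psql "postgresql://steampipe@localhost:9193/steampipe"
```

Once connected, you can run any SQL you would normally run in the Steampipe query shell, for example `select vpc_id, cidr_block, state from aws_vpc;`.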

To stop the Steampipe service, issue the `steampipe service stop` command. --- --- title: steampipe completion sidebar_label: steampipe completion --- # steampipe completion Generate the autocompletion script for `steampipe` for supported shells. This helps you configure your terminal’s shell so that `steampipe` commands autocomplete when you press the TAB key. ## Usage ```bash steampipe completion [bash|fish|zsh] ``` ## Sub-Commands | Command | Description |-|- | `bash` | Generate completion code for `bash` | `fish` | Generate completion code for `fish` | `zsh` | Generate completion code for `zsh` ### steampipe completion bash Generate the autocompletion script for the `bash` shell. #### Pre-requisites This script depends on the `bash-completion` package. If it is not installed already, you can install it via your OS’s package manager. Most Linux distributions have bash-completion installed by default, however it is not installed by default in Mac OS. For example, to install the [bash-completion package with homebrew](https://formulae.brew.sh/formula/bash-completion@2): ```bash brew install bash-completion ``` Once installed, edit your `.bash_profile` or `.bashrc` file and add the following line: ```bash [[ -r "$(brew --prefix)/etc/profile.d/bash_completion.sh" ]] && . "$(brew --prefix)/etc/profile.d/bash_completion.sh" ``` #### Examples Review the configuration: ```bash steampipe completion bash ``` Enable auto-complete in your current shell session: ``` source <(steampipe completion bash) ``` Enable auto-complete for every new session (execute once). You will need to start a new shell for this setup to take effect: Linux: ```bash steampipe completion bash > /etc/bash_completion.d/steampipe ``` MacOS: ```bash steampipe completion bash > /usr/local/etc/bash_completion.d/steampipe ``` ### steampipe completion fish Generate the autocompletion script for the `fish` shell. #### Examples Review the configuration: ```bash steampipe completion fish ``` Enable auto-complete in your current shell session: ```bash steampipe completion fish | source ``` Enable auto-complete for every new session (execute once). You will need to start a new shell for this setup to take effect: ```bash steampipe completion fish > ~/.config/fish/completions/steampipe.fish ``` ### steampipe completion zsh Generate the autocompletion script for the `zsh` shell. #### Pre-requisites If shell completion is not enabled in your environment, you will need to enable it using: ```bash echo "autoload -U compinit; compinit" >> ~/.zshrc ``` You will need to start a new shell for this setup to take effect. #### Examples Review the configuration: ```bash steampipe completion zsh ``` Enable auto-complete for every new session (execute once). You will need to start a new shell for this setup to take effect: ```bash steampipe completion zsh > "${fpath[1]}/_steampipe" && compinit ``` --- --- title: steampipe help sidebar_label: steampipe help --- # steampipe help Display help and usage information for any command in the application. ## Usage ```bash steampipe help [command] [flags] ``` ## Examples Show help: ```bash steampipe help ``` Show help for the `plugin` sub-command: ```bash steampipe help plugin ``` Show help for the `plugin install` sub-command: ```bash steampipe help plugin install ``` --- --- title: steampipe login sidebar_label: steampipe login --- # steampipe login Log in to [Turbot Pipes](https://turbot.com/pipes/docs). The Steampipe CLI can interact with Turbot Pipes to run queries against a remote cloud database. 
This capability requires authenticating to Turbot Pipes. The `steampipe login` command launches an interactive process for logging in and obtaining a temporary (30-day) token. The token is written to `~/.pipes/internal/{cloud host}.tptt`. ## Usage ```bash steampipe login ``` ## Flags
| Argument | Description
|-|-
| `--pipes-host` | Sets the Turbot Pipes host used when connecting to Turbot Pipes workspaces. See PIPES_HOST for details.
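As a simple sketch of this flag in use (the host below is purely illustrative; by default you do not need to pass it at all):

```bash
# Log in against a Turbot Pipes host other than the default
steampipe login --pipes-host pipes.example.com
```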
## Examples Log in to `pipes.turbot.com`: ```bash steampipe login ``` The `steampipe login` command will launch your web browser to continue the login process. Verify the request. After you have verified the request, the browser will display a verification code. Paste the code into the CLI and hit `Enter` to complete the login process: ```bash $ steampipe login Verify login at https://pipes.turbot.com/login/token?r=tpttr_cdckfake6ap10t9dak0g_3u2k9hfake46g4o4wym7h8hw Enter verification code: 745278 Login successful for user johnsmyth ``` --- --- title: Steampipe CLI sidebar_label: Steampipe CLI --- # Steampipe CLI ## Sub-Commands | Command | Description |-|- | [steampipe completion](/docs/reference/cli/completion)| Generate the autocompletion script for the specified shell | [steampipe help](/docs/reference/cli/help) | Help about any command | [steampipe login](/docs/reference/cli/login) | Log in to Turbot Pipes | [steampipe plugin](/docs/reference/cli/plugin) | Steampipe plugin management | [steampipe query](/docs/reference/cli/query) | Execute SQL queries interactively or by argument | [steampipe service](/docs/reference/cli/service)| Steampipe service management ## Global Flags
| Flag | Description
|-|-
| `-h`, `--help` | Help for Steampipe.
| `--install-dir` | Sets the directory for the Steampipe installation, in which the Steampipe database, plugins, and supporting files can be found. See STEAMPIPE_INSTALL_DIR for details.
| `--workspace` | Sets the Steampipe workspace profile. If not specified, the `default` workspace will be used if it exists. See STEAMPIPE_WORKSPACE for details.
| `-v`, `--version` | Display Steampipe version.
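As a rough sketch of how these global flags combine (the `~/steampipe2` install directory and `acme_prod` workspace below are illustrative values; the workspace must already be defined in your `workspaces.spc`):

```bash
# Run a single query against an alternate install directory,
# using the "acme_prod" workspace profile
steampipe query --install-dir ~/steampipe2 --workspace acme_prod "select 1"
```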
## Exit Codes | Value | Name | Description |---------|---------------------------------------|---------------------------------------- | **0** | `ExitCodeSuccessful` | Steampipe ran successfully, with no runtime errors, control errors, or alarms | **11** | `ExitCodePluginLoadingError` | Plugin loading error | **12** | `ExitCodePluginListFailure` | Plugin listing failed | **13** | `ExitCodePluginNotFound` | Plugin not found | **14** | `ExitCodePluginInstallFailure` | Plugin install failed | **31** | `ExitCodeServiceSetupFailure` | Service setup failed | **32** | `ExitCodeServiceStartupFailure` | Service start failed | **33** | `ExitCodeServiceStopFailure` | Service stop failed | **41** | `ExitCodeQueryExecutionFailed` | One or more queries failed for `steampipe query` | **51** | `ExitCodeLoginCloudConnectionFailed` | Connecting to cloud failed | **249** | `ExitCodeInvalidExecutionEnvironment` | Steampipe was run in an unsupported environment | **250** | `ExitCodeInitializationFailed` | Initialization failed | **251** | `ExitCodeBindPortUnavailable` | Network port binding failed | **253** | `ExitCodeFileSystemAccessFailure` | File system access failed | **254** | `ExitCodeInsufficientOrWrongInputs` | Runtime error - insufficient or incorrect input | **255** | `ExitCodeUnknownErrorPanic` | Runtime error - an unknown panic occurred --- --- title: steampipe plugin sidebar_label: steampipe plugin --- # steampipe plugin Steampipe plugin management. Plugins extend Steampipe to work with many different services and providers. Find plugins using the public registry at [hub.steampipe.io](https://hub.steampipe.io). ## Usage ```bash steampipe plugin [command] ``` ## Available Commands: | Command | Description |-|- | `install` | Install one or more plugins | `list` | List currently installed plugins | `uninstall` | Uninstall a plugin | `update ` | Update one or more plugins
## Flags

| Flag | Description
|-|-
| `--all` | Applies only to `plugin update`; updates ALL installed plugins.
| `--progress` | Enable or disable progress information. By default, progress information is shown - set `--progress=false` to hide the progress bar. Applies only to `plugin install` and `plugin update`.
| `--skip-config` | Applies only to `plugin install`; skips creating the default config file for the plugin.
## Examples Install or update a plugin: ```bash steampipe plugin install aws ``` Install a specific version of a plugin: ```bash steampipe plugin install aws@0.107.0 ``` Install the latest version of a plugin matching a semver constraint: ```bash steampipe plugin install aws@^0.107 ``` Note: if your semver constraint contains special characters you may need to quote it: ```bash steampipe plugin install "aws@>=0.100" ``` Install all missing plugins that are specified in configuration files. Do not download their default configuration files: ```bash steampipe plugin install --skip-config ``` List installed plugins: ```bash steampipe plugin list ``` Uninstall a plugin: ```bash steampipe plugin uninstall dmi/paper ``` Update all plugins to the latest in the installed stream: ```bash steampipe plugin update --all ``` Update the aws plugin to the latest version meeting the constraint: ```bash steampipe plugin update aws@^0.107 ``` Update all plugins to the latest and hide the progress bar: ```bash steampipe plugin update --all --progress=false ``` --- --- title: steampipe query sidebar_label: steampipe query --- # steampipe query Execute SQL queries interactively, or by a query argument. To open the interactive query shell, run `steampipe query` with no arguments. The query shell provides a way to explore your data and run multiple queries. If a query string is passed on the command line then it will be run immediately and the command will exit. Alternatively, you may specify one or more files containing SQL statements. You can run multiple SQL files by passing a glob or a space-separated list of file names. If the Steampipe service was previously started by `steampipe service start`, Steampipe will connect to the service instance - otherwise, the query command will start the `service`. At the end of the query command or session, if other sessions have not connected to the `service` already, the `service` will be shut down. If other sessions have already connected to the `service`, then the last session to exit will shut down the `service`. ## Usage Run the Steampipe [interactive query shell](/docs/query/query-shell): ```bash steampipe query [flags] ``` Run a [batch query](/docs/query/batch-query): ```bash steampipe query {query} [flags] ``` ## Flags
| Argument | Description
|-|-
| `--export string` | Export query output to a file. You may export multiple output formats by entering multiple `--export` arguments. If a file path is specified as an argument, its type will be inferred by the suffix. Supported export formats are `sps` (`snapshot`).
| `--header string` | Specify whether to include column headers in csv and table output (default `true`).
| `--help` | Help for `steampipe query`.
| `--output string` | Select the console output format. Possible values are `line`, `csv`, `json`, `table`, `snapshot` (default `table`).
| `--pipes-host` | Sets the Turbot Pipes host used when connecting to Turbot Pipes workspaces. See PIPES_HOST for details.
| `--pipes-token` | Sets the Turbot Pipes authentication token used when connecting to Turbot Pipes workspaces. See PIPES_TOKEN for details.
| `--progress` | Enable or disable progress information. By default, progress information is shown - set `--progress=false` to hide the progress bar.
| `--query-timeout int` | The query timeout, in seconds. The default is `0` (no timeout).
| `--search-path strings` | Set a comma-separated list of connections to use as a custom search path for the query session.
| `--search-path-prefix strings` | Set a comma-separated list of connections to use as a prefix to the current search path for the query session.
| `--separator string` | A single character to use as a separator string for csv output (defaults to `,`).
| `--share` | Create snapshot in Turbot Pipes with `anyone_with_link` visibility.
| `--snapshot` | Create snapshot in Turbot Pipes with the default (`workspace`) visibility.
| `--snapshot-location string` | The location to write snapshots - either a local file path or a Turbot Pipes workspace.
| `--snapshot-tag string=string` | Specify tags to set on the snapshot. Multiple `--snapshot-tag` arguments may be passed.
| `--snapshot-title string` | The title to give a snapshot when uploading to Turbot Pipes.
| `--timing string` | Enable or disable query execution timing: `off` (default), `on`, or `verbose`.
| `--workspace-database` | Sets the database that Steampipe will connect to. This can be `local` (the default) or a remote Turbot Pipes database. See STEAMPIPE_WORKSPACE_DATABASE for details.
## Examples Open an interactive query console: ```bash steampipe query ``` Run a specific query directly: ```bash steampipe query "select * from aws_s3_bucket" ``` Run a query and save a [snapshot](/docs/snapshots/batch-snapshots): ```bash steampipe query --snapshot "select * from aws_s3_bucket" ``` Run a query and share a [snapshot](/docs/snapshots/batch-snapshots): ```bash steampipe query --share "select * from aws_s3_bucket" ``` Run the SQL command in the `my_queries/my_query.sql` file: ```bash steampipe query my_queries/my_query.sql ``` Run the SQL commands in all `.sql` files in the `my_queries` directory and concatenate the results: ```bash steampipe query my_queries/*.sql ``` Run a query and report the query execution time: ```bash steampipe query "select * from aws_s3_bucket" --timing ``` Run a query and report the query execution time and details for each scan: ```bash steampipe query "select * from aws_s3_bucket" --timing=verbose ``` Run a query and return output in json format: ```bash steampipe query "select * from aws_s3_bucket" --output json ``` Run a query and return output in CSV format: ```bash steampipe query "select * from aws_s3_bucket" --output csv ``` Run a query and return output in pipe-separated format: ```bash steampipe query "select * from aws_s3_bucket" --output csv --separator '|' ``` Run a query with a specific search_path: ```bash steampipe query --search-path="aws_dmi,github,slack" "select * from aws_s3_bucket" ``` Run a query with a specific search_path_prefix: ```bash steampipe query --search-path-prefix="aws_dmi" "select * from aws_s3_bucket" ``` --- --- title: steampipe service sidebar_label: steampipe service --- # steampipe service Steampipe service management. `steampipe service` allows you to run Steampipe as a local service, exposing it as a database endpoint for connection from any Postgres-compatible database client. ## Usage ```bash steampipe service [command] ``` ## Sub-Commands | Command | Description |-|- | `restart` | Restart Steampipe service | `start` | Start Steampipe in service mode | `status` | Status of the Steampipe service | `stop` | Stop Steampipe service ## Flags | Flag | Applies to | Description |-|-|- | `--database-listen string` | `start` | Accept database connections from: `local` (localhost only) or `network` (open) | `--database-password string` | `start` | Set the steampipe database password for this session. 
See [STEAMPIPE_DATABASE_PASSWORD](/docs/reference/env-vars/steampipe_database_password) for additional information | `--database-port int` | `start` | Database service port (default 9193) | `--force` | `stop`, `restart` | Forces the service to shutdown, releasing all open connections and ports | `--foreground` | `start` | Run the service in the foreground | `--show-password` | `start`, `status` | View database password for connecting from another machine (default false) | `--all` | `status` | Bypass the `--install-dir` and print status of all running services ## Examples Start Steampipe in the background (service mode): ```bash steampipe service start ``` Start Steampipe on port 9194 ```bash steampipe service start --database-port 9194 ``` Start the Steampipe service with a custom password: ```bash steampipe service start --database-password MyCustomPassword ``` Start Steampipe on `localhost` only ```bash steampipe service start --database-listen local ``` Stop the Steampipe service: ```bash steampipe service stop ``` Forcefully kill all Steampipe services: ```bash steampipe service stop --force ``` View Steampipe service status: ```bash steampipe service status ``` View Steampipe service status and display the database password: ```bash steampipe service status --show-password ``` View status of all running Steampipe services: ```bash steampipe service status --all ``` Restart the Steampipe service: ```bash steampipe service restart ``` --- --- title: connection sidebar_label: connection --- # connection The `connection` block defines a Steampipe [plugin connection](/docs/managing/plugins#installing-plugins) or [aggregator](/docs/managing/connections#using-aggregators). Most `connection` arguments are plugin-specific, and they are used to specify credentials, accounts, and other options. The [Steampipe Hub](https://hub.steampipe.io/plugins) provides detailed information about the arguments for each plugin. ## Supported options | Argument | Default | Values | Description |-|-|-|- | `import_schema` | `enabled` | `enabled`, `disabled` | Enable or disable the creation of a Postgres schema for this connection. When `import_schema` is disabled, Steampipe will not create a schema for the connection (and will delete it if it exists), but the connection will still be queryable from any aggregator that includes it. For installations with a large number of connections, setting `import_schema` to `disabled` can decrease startup time and increase performance. | `plugin` | none | [plugin version string](#plugin-version-strings) or [plugin reference](/docs/reference/config-files/plugin) | The plugin version / instance that this connection uses. This must refer to an [installed plugin version](/docs/managing/plugins#installing-plugins). | `type` | `plugin` | `plugin`, `aggregator` | The type of connection - [plugin connection](/docs/managing/plugins#installing-plugins) or [aggregator](/docs/managing/connections#using-aggregators). | `{plugin argument}`| varies | varies| Additional options are defined in each plugin - refer to the documentation for your plugin on the [Steampipe Hub](https://hub.steampipe.io/plugins). ### Plugin Version Strings Steampipe plugin versions are in the format: ``` [{organization}/]{plugin name}[@release stream] ``` The `{organization}` is optional, and if it is not specified, it is assumed to be `turbot`. The `{release stream}` is also optional, and defaults to `@latest`. 
As a result, plugin versions are usually simple plugin names: ```hcl connection "net" { plugin = "net" # this is the same as turbot/net@latest } ``` You may specify a [specific version](/docs/managing/plugins#installing-a-specific-version): ```hcl connection "net" { plugin = "net@0.6.0" } ``` Or a [release stream](/docs/managing/plugins#installing-from-a-release-stream): ```hcl connection "net" { plugin = "net@0.6" } ``` For third-party plugins, the `{organization}` must be specified: ```hcl connection "scalingo" { plugin = "francois2metz/scalingo" } ``` You can even use a [local path](/docs/managing/plugins#installing-from-a-file) while developing plugins: ```hcl connection "myplugin" { plugin = "local/myplugin" } ``` ## Examples Connections using [plugin version strings](#plugin-version-strings): ```hcl connection "aws_all" { type = "aggregator" plugin = "aws" connections = ["aws_*"] } connection "aws_01" { plugin = "aws" profile = "aws_01" regions = ["*"] } connection "aws_02" { plugin = "aws" import_schema = "disabled" profile = "aws_02" regions = ["us-*", "eu-*"] } connection "aws_03" { plugin = "aws" aws_access_key_id = "AKIA4YFAKEKEYXTDS252" aws_secret_access_key = "SH42YMW5p3EThisIsNotRealzTiEUwXN8BOIOF5J8m" regions = ["us-east-1", "us-west-2"] } ``` Connections using [plugin reference](/docs/reference/config-files/plugin): ```hcl plugin "aws" { memory_max_mb = 2048 } connection "aws_all" { type = "aggregator" plugin = plugin.aws connections = ["aws_*"] } connection "aws_01" { plugin = plugin.aws profile = "aws_01" regions = ["*"] } connection "aws_02" { plugin = plugin.aws import_schema = "disabled" profile = "aws_02" regions = ["us-*", "eu-*"] } ``` --- --- title: options sidebar_label: options --- # options Configuration options are defined using HCL `options` blocks in one or more Steampipe config files. Steampipe will load ALL configuration files from `~/.steampipe/config` that have a `.spc` extension. By default, Steampipe creates a `~/.steampipe/config/default.spc` file for setting `options`. Note that many of the `options` settings can also be specified via other mechanisms, such as command line arguments, environment variables, etc. These settings are resolved in a standard order: 1. Explicitly set in session (via a meta-command). 2. Specified as a command line argument. 3. Set in an environment variable. 4. Set in a configuration file `options` argument. 5. If not specified, a default value is used. The following `options` are currently supported: | Option Type | Description |-|- | [database](#database-options) | Database options. | [general](#general-options) | General CLI options, such as auto-update options. | [plugin](#plugin-options) | Plugin options. --- ## Database Options **Database** options are used to control database behavior, such as the IP address and port on which the database listens. ### Supported options | Argument | Default | Values | Description |-|-|-|- | `cache` | `true` | `true`, `false` | Enable or disable query caching. This can also be set via the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable. | `cache_max_size_mb` | unlimited | an integer | The maximum total size of the query cache across all plugins. This can also be set via the [STEAMPIPE_CACHE_MAX_SIZE_MB](/docs/reference/env-vars/steampipe_cache_max_size_mb) environment variable. | `cache_max_ttl` | `300` | an integer | The maximum length of time to cache query results, in seconds.
This can also be set via the [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl) environment variable. | `listen` | `network` | `local`, `network`| The network listen mode when Steampipe is started in [service mode](/docs/managing/service#starting-the-database-in-service-mode). Use `network` to listen on all IP addresses, or `local` to restrict to localhost. | `port` | `9193` | any valid, open port number | The TCP port that Postgres will listen on. | `search_path` | All connections, alphabetically | Comma separated string | Set an exact [search path](/docs/managing/connections#setting-the-search-path). Note that setting the search path in the database options sets it in the database; this setting will also be in effect when connecting to Steampipe from 3rd-party tools. See also: [Using search_path to target connections and aggregators](https://steampipe.io/docs/guides/search-path). | `search_path_prefix` | none | Comma separated string | Move one or more connections or aggregators to the front of the [search path](/docs/managing/connections#setting-the-search-path). Note that setting the search path prefix in the database options sets in the database; this setting will also be in effect when connecting to Steampipe from 3rd-party tools. See also: [Using search_path to target connections and aggregators](https://steampipe.io/docs/guides/search-path). | `start_timeout` | `30` | an integer | The maximum time (in seconds) to wait for the Postgres process to start accepting queries after it has been started. This can also be set via the [STEAMPIPE_DATABASE_START_TIMEOUT](/docs/reference/env-vars/steampipe_database_start_timeout) environment variable. ### Example: Database Options ```hcl options "database" { cache = true # true, false cache_max_ttl = 900 # max expiration (TTL) in seconds cache_max_size_mb = 1024 # max total size of cache across all plugins port = 9193 # any valid, open port number listen = "local" # local, network search_path_prefix = "aws,aws2,gcp,gcp2" # comma-separated string; an exact search_path start_timeout = 30 # maximum time (in seconds) to wait for the database to start up } ``` --- ## General options **General** options apply generally to the Steampipe CLI. ### Supported options | Argument | Default | Values | Description |-|-|-|- | `log_level` | `warn` | `trace`, `debug`, `info`, `warn`, `error` | Sets the output logging level. Standard log levels are supported. This can also be set via the [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) environment variable. | `memory_max_mb` | `1024` | | Set a memory soft limit for the `steampipe` process. Set to `0` to disable the memory limit. This can also be set via the [STEAMPIPE_MEMORY_MAX_MB](/docs/reference/env-vars/steampipe_memory_max_mb) environment variable. | `telemetry` | `none` | `none`, `info` | Set the telemetry level in Steampipe. This can also be set via the [STEAMPIPE_TELEMETRY](/docs/reference/env-vars/steampipe_telemetry) environment variable. See also: [Telemetry](https://steampipe.io/blog/release-0-15-0#telemetry). | `update_check` | `true` | `true`, `false` | Enable or disable automatic update checking. This can also be set via the [STEAMPIPE_UPDATE_CHECK](/docs/reference/env-vars/steampipe_update_check) environment variable. 
### Example: General Options ```hcl options "general" { log_level = "warn" # trace, debug, info, warn, error memory_max_mb = 512 # megabytes telemetry = "info" # info, none update_check = true # true, false } ``` --- ## Plugin Options **Plugin** options are used to set plugin default options, such as memory soft limits. ### Supported options | Argument | Default | Values | Description |-|-|-|- | `memory_max_mb` | `1024` | an integer | Set a default memory soft limit for each plugin process. Note that each plugin can have its own `memory_max_mb` set in [a `plugin` definition](/docs/reference/config-files/plugin), and that value would override this default setting. Set to `0` to disable the memory limit. This can also be set via the [STEAMPIPE_PLUGIN_MEMORY_MAX_MB](/docs/reference/env-vars/steampipe_plugin_memory_max_mb) environment variable. ### Example: Plugin Options ```hcl options "plugin" { memory_max_mb = 2048 # megabytes } ``` --- --- title: Configuration Files sidebar_label: Configuration Files --- # Configuration Files Configuration file resources are defined using HCL in one or more Steampipe config files. Steampipe will load ALL configuration files from `~/.steampipe/config` that have a `.spc` extension. Typically, config files are laid out as follows: - Steampipe creates a `~/.steampipe/config/default.spc` file for setting [options](/docs/reference/config-files/options). - Each plugin creates a `~/.steampipe/config/{plugin name}.spc` (e.g. `aws.spc`, `github.spc`, `net.spc`, etc). Define your [connections](/docs/reference/config-files/connection) and [plugins](/docs/reference/config-files/plugin) in these files. - Define your [workspaces](/docs/reference/config-files/workspace) in `~/.steampipe/config/workspaces.spc`. --- --- title: plugin sidebar_label: plugin --- # plugin The `plugin` block enables you to set plugin-level options like soft memory limits and rate limiters. You can then associate connections with the plugin. ```hcl plugin "aws" { memory_max_mb = 2048 limiter "aws_global" { max_concurrency = 200 } } ``` ## Supported options | Argument | Default | Description |-|-|- | `source` | none | A [plugin version string](#plugin-version-strings) that specifies which plugin this configuration applies to. If not specified, the plugin block label is assumed to be the plugin source. | `memory_max_mb` | `1024` | The soft memory limit for the plugin, in MB. Steampipe sets `GOMEMLIMIT` for the plugin process to the specified value. The Go runtime does not guarantee that the memory usage will not exceed the limit, but rather uses it as a target to optimize garbage collection. | `limiter` | none | Optional [limiter](#limiter) blocks used to set concurrency and/or rate limits. ## Plugins and Connections You may optionally define a `plugin` in a `.spc` file to set plugin-level options like soft memory limits and rate limiters. ```hcl plugin "aws" { memory_max_mb = 2048 } ``` The block label is assumed to be the plugin short name unless the `source` argument is present. The label may only contain alphanumeric characters, dashes, or underscores. The `source` argument, however, accepts any [plugin version string](/docs/reference/config-files/connection#plugin-version-strings) allowing you to refer to any version. ```hcl plugin "my_aws" { source = "aws@0.41.0" memory_max_mb = 2048 } ``` In a `connection` you may continue to use the current syntax for the `plugin` argument.
Steampipe will resolve the `connection` to the `plugin` as long as they resolve to the same plugin version: ```hcl connection "aws" { plugin = "aws" } plugin "aws" { memory_max_mb = 2048 } ``` Note that if a `connection` specifies a plugin version string that resolves to more than 1 plugin instance, Steampipe will not be able to load the connection, as it cannot assume which plugin instance to resolve to. For example, this configuration will cause a warning and the connection will be in error: ```hcl connection "aws" { plugin = "aws" } plugin "aws" { memory_max_mb = 2048 } plugin "aws_low_mem" { source = "aws" memory_max_mb = 512 } ``` You may instead specify a reference to a `plugin` block in your `connection` to disambiguate: ```hcl connection "aws" { plugin = plugin.aws } plugin "aws" { memory_max_mb = 2048 } plugin "aws_low_mem" { source = "aws" memory_max_mb = 512 } ```
Steampipe will create a separate plugin process for each `plugin` defined that has connections associated to it. This allows you to run multiple versions side by side, but also to create multiple processes with the SAME version to allow you to create QOS groups. In the following example, Steampipe will create 2 plugin processes: - One process has a 2000 MB memory soft limit and no limiters, and contains the `aws_prod_1`, `aws_prod_2`, and `aws_prod_3` connections. - One process has a 500 MB memory soft limit and the `all_requests` limiter, and contains the `aws_dev_1` and `aws_dev_2` connections. ```hcl plugin "aws_high" { memory_max_mb = 2000 source = "aws" } plugin "aws_low" { memory_max_mb = 500 source = "aws" limiter "all_requests" { bucket_size = 100 fill_rate = 100 max_concurrency = 50 } } connection "aws_prod_1" { plugin = plugin.aws_high profile = "prod1" regions = ["*"] } connection "aws_prod_2" { plugin = plugin.aws_high profile = "prod2" regions = ["*"] } connection "aws_prod_3" { plugin = plugin.aws_high profile = "prod3" regions = ["*"] } connection "aws_dev_1" { plugin = plugin.aws_low profile = "dev1" regions = ["*"] } connection "aws_dev_2" { plugin = plugin.aws_low profile = "dev2" regions = ["*"] } ``` Note that the aggregators can only aggregate connections from the single plugin instance for which they are configured. Extending the previous example: ```hcl connection "aws_prod" { plugin = plugin.aws_high type = "aggregator" connections = ["*"] } connection "aws_dev" { plugin = plugin.aws_low type = "aggregator" connections = ["*"]} ``` - The `aws_prod` aggregator will include the `aws_prod_1`, `aws_prod_2`, and `aws_prod_3` connections - The `aws_dev` aggregator will include the `aws_dev_1` and `aws_dev_2` connections You can also run multiple plugin versions side-by-side: ```hcl plugin "aws_latest" { source = "aws" } plugin "aws_0_117_0" { source = "aws@0.117.0" } connection "aws_prod_1" { plugin = plugin.aws_latest profile = "prod1" regions = ["*"] } connection "aws_prod_2" { plugin = plugin.aws_0_117_0 profile = "prod2" regions = ["*"] } ``` ## limiter Limiters provide a simple, flexible interface to implement client-site rate limiting and concurrency thresholds. You can use limiters to: - Smooth the request rate from steampipe to reduce load on the remote API or service - Limit the number of parallel request to reduce contention for client and network resources - Avoid hitting server limits and throttling ### Supported options | Argument | Default | Description |-------------------|-----------|-------------------- | `bucket_size` | unlimited | The maximum number of requests that may be made per second (the burst size). Used in combination with `fill_rate` to implement a token-bucket rate limit. | `fill_rate` | unlimited | The number of requests that are added back to refill the bucket each second. Used in combination with `bucket_size` to implement a token-bucket rate limit. | `max_concurrency` | The maximum number of [List, Get, and Hydrate functions](/docs/develop/writing-plugins#hydrate-functions) that can run in parallel. | `scope` | `[]` | The context for the limit - which resources are subject to / counted against the limit. If no scope is specified, then the limiter applies to all functions in the plugin. If you specify a list of scopes, then *a limiter instance is created for each unique combination of scope values* - it acts much like `group by` in a sql statement. | `where` | none | A `where` clause to further filter the scopes to specific values. 
### `where` syntax The `where` argument supports the following PostgreSQL comparison operators: | Operator | Description |----------|-------------------------------- | `<` | less than | `<=` | less than or equal | `=` | equal | `!=` | not equal | `<>` | not equal | `>=` | greater than or equal | `>` | greater than | `like` | string like (case sensitive) | `ilike` | string like (case insensitive) | `is null`| null test | `not` | logical negation | `and` | logical conjunction | `or` | logical disjunction | `in` | set membership (equality) You may use parentheses to force explicit lexical precedence, otherwise [standard PostgreSQL operator precedence](https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-PRECEDENCE) applies. ## Examples See the [Concurrency & Rate Limiting](/docs/guides/limiter) for more examples. ```hcl plugin "aws" { # up to 250 functions concurrently across all connections limiter "aws_global_concurrency" { max_concurrency = 250 } # up to 1000 functions per second in us-east-1 for each connection limiter "aws_rate_limit_us_east_1" { bucket_size = 1000 fill_rate = 1000 scope = ["connection", "region"] where = "region = 'us-east-1'" } # up to 200 functions per second in regions OTHER than us-east-1 # for each connection limiter "aws_rate_limit_non_us_east_1" { bucket_size = 200 fill_rate = 200 scope = ["connection", "region"] where = "region <> 'us-east-1'" } } ``` --- --- title: workspace sidebar_label: workspace --- # workspace A Steampipe `workspace` is a "profile" that allows you to define a unified environment that the Steampipe client can interact with. Each workspace is composed of a single Steampipe database instance as well as other context-specific settings and options. Workspace configurations can be defined in any `.spc` file in the `~/.steampipe/config` directory, but by convention they are defined in `~/.steampipe/config/workspaces.spc` file. This file may contain multiple workspace definitions that can then be referenced by name. Steampipe workspaces allow you to define multiple named configurations and easily switch between them using the `--workspace` argument or `STEAMPIPE_WORKSPACE` environment variable. ```hcl workspace "local" { workspace_database = "local" } workspace "acme_prod" { workspace_database = "acme/prod" query_timeout = 600 } ``` To learn more, see **[Managing Workspaces →](/docs/managing/workspaces)** ## Workspace Arguments Many of the workspace arguments correspond to CLI flags and/or environment variables. Any unset arguments will assume the default values. | Argument | Default | Description |---------------------|-----------------------------------------------|----------------------------------------- | `base` | | A reference to a named workspace resource that this workspace should source its definition from. Any argument can be overridden after sourcing via base. | `cache` | `true` | Enable/disable caching. Note that is a **client** setting - if the database (`options "database"`) has the cache disabled, then the cache is disabled regardless of the workspace setting.

Env: [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) | `cache_ttl` | `300` | Set the client query cache expiration (TTL) in seconds. Note that this is a **client** setting - if the database `cache_max_ttl` is lower than the `cache_ttl` in the workspace, then the effective TTL for this workspace is the `cache_max_ttl`.

Env: [STEAMPIPE_CACHE_TTL](/docs/reference/env-vars/steampipe_cache_ttl) | `install_dir` | `~/.steampipe` | The directory in which the Steampipe database, plugins, and supporting files can be found.

Env: [STEAMPIPE_INSTALL_DIR](/docs/reference/env-vars/steampipe_install_dir)
CLI: `--install-dir` | `options` | | An options block to set command-specific options for this workspace. [Query](#steampipe-query-options), [check](#steampipe-check-options), and [dashboard](#steampipe-dashboard-options) options are supported. | `pipes_host` | `pipes.turbot.com` | Set the Turbot Pipes host for connecting to Turbot Pipes workspace.

Env: [PIPES_HOST](/docs/reference/env-vars/pipes_host)
CLI: `--pipes-host` | `pipes_token` | The token obtained by `steampipe login` | Set the Turbot Pipes authentication token for connecting to a Turbot Pipes workspace. This may be a token obtained by `steampipe login` or a user-generated [token](https://turbot.com/pipes/docs/profile#tokens).

Env: [PIPES_TOKEN](/docs/reference/env-vars/pipes_token)
CLI: `--pipes-token` | `progress` | `true` | Enable or disable progress information.

CLI: `--progress` | `query_timeout` | `240` for controls, unlimited otherwise | The maximum time (in seconds) a query is allowed to run before it times out.

Env: [STEAMPIPE_QUERY_TIMEOUT](/docs/reference/env-vars/steampipe_query_timeout)
CLI: `--query-timeout` | `search_path` | `public`, then alphabetical | A comma-separated list of connections to use as a custom search path for the control run. See also: [Using search_path to target connections and aggregators](https://steampipe.io/docs/guides/search-path).

CLI: `--search-path` | `search_path_prefix`| | A comma-separated list of connections to use as a prefix to the current search path for the control run. See also: [Using search_path to target connections and aggregators](https://steampipe.io/docs/guides/search-path).

CLI: `--search-path-prefix` | `snapshot_location` | The Turbot Pipes user's personal workspace | Set the Turbot Pipes workspace or filesystem path for writing snapshots.

Env: [STEAMPIPE_SNAPSHOT_LOCATION](/docs/reference/env-vars/steampipe_snapshot_location)
CLI: `--snapshot-location` | `workspace_database`| `local` | Workspace database. This can be local or a remote Turbot Pipes database.

Env: [STEAMPIPE_WORKSPACE_DATABASE](/docs/reference/env-vars/steampipe_workspace_database)
CLI: `--workspace-database` ### Steampipe Query Options A `workspace` may include an `options "query"` block to specify values specific to the `steampipe query` command. These options often correspond to CLI flags.
| Argument | Default | Description
|-|-|-
| `autocomplete` | `true` | Enable or disable autocomplete in the interactive query shell.
| `header` | `true` | Enable or disable column headers. CLI: `--header`
| `multi` | `false` | Enable or disable multiline mode.
| `output` | `table` | Set output format (`json`, `csv`, `table`, or `line`). CLI: `--output`
| `separator` | `,` | Set csv output separator. CLI: `--separator`
| `timing` | `off` | Enable or disable query execution timing: `off`, `on`, or `verbose`. CLI: `--timing`
## Examples ```hcl workspace "default" { query_timeout = 300 } workspace "all_options" { pipes_host = "pipes.turbot.com" pipes_token = "tpt_999faketoken99999999_111faketoken1111111111111" install_dir = "~/steampipe2" query_timeout = 300 workspace_database = "local" snapshot_location = "acme/dev" search_path = "aws,aws_1,aws_2,gcp,gcp_1,gcp_2,slack,github" search_path_prefix = "aws_all" progress = true cache = true cache_ttl = 300 options "query" { autocomplete = true header = true # true, false multi = false # true, false output = "table" # json, csv, table, line separator = "," # any single char timing = "on" # on, off, verbose } } ``` --- --- title: .autocomplete sidebar_label: .autocomplete --- # .autocomplete Turn autocomplete on or off in the interactive query shell. ## Usage ``` .autocomplete [on | off] ``` ## Examples Turn off autocomplete: ``` .autocomplete off ``` Turn on autocomplete: ``` .autocomplete on ``` --- --- title: .cache sidebar_label: .cache --- # .cache Enable, disable or clear the query cache for this session. ## Usage ``` .cache [on | off | clear] ``` ## Examples Clear the cache: ``` .cache clear ``` Turn off caching: ``` .cache off ``` Turn on caching: ``` .cache on ``` --- --- title: .cache_ttl sidebar_label: .cache_ttl --- # .cache_ttl Set the query cache TTL (time-to-live) for this session. ## Usage ``` .cache_ttl [integer] ``` ## Examples Set the cache TTL to 15 minutes: ``` .cache_ttl 900 ``` --- --- title: .clear sidebar_label: .clear --- # .clear Clear the console screen. ## Usage ``` .clear ``` --- --- title: .connections sidebar_label: .connections --- # .connections List the active connections (schemas). ## Usage ``` .connections ``` --- --- title: .exit sidebar_label: .exit --- # .exit Exit the steampipe interactive query session. ## Usage ``` .exit ``` --- --- title: .header sidebar_label: .header --- # .header Turn column headers on or off. ## Usage ``` .header [on | off] ``` ## Examples Turn off column headers: ``` .header off ``` Turn on column headers: ``` .header on ``` --- --- title: .help sidebar_label: .help --- # .help Show help for Steampipe. ## Usage ``` .help ``` --- --- title: .inspect sidebar_label: .inspect --- # .inspect Inspect the available connections, tables, and columns. ## Usage ``` .inspect [connection][.table] ``` ## Examples List all active connections: ``` .inspect ``` List all tables in the aws connection: ``` .inspect aws ``` List the columns in the aws_ec2_instance table: ``` .inspect aws.aws_ec2_instance ``` --- --- title: .multi sidebar_label: .multi --- # .multi Enable or disable multi-line mode. Multi-line mode is off by default, and queries will be executed as soon as you hit the `Enter` key. Enabling multi-line mode allows you to write long queries that span multiple lines. The query will not be executed when you press `Enter` unless it ends with a semi-colon. ## Usage ``` .multi [on | off] ``` ## Examples Turn off multi-line mode: ``` .multi off ``` Turn on multi-line mode: ``` .multi on ``` --- --- title: .output sidebar_label: .output --- # .output Change the output mode. By default, the output format is `table`, which provides a tabular, human-readable view. You can use the `.output` command to choose a different format. Valid values for this command are `json`, `csv`, `line`, and `table`.
## Usage ``` .output [ table | json | csv | line] ``` ## Examples Change the output mode to json: ``` .output json ``` Change the output mode to csv: ``` .output csv ``` Change the output mode to table: ``` .output table ``` Change the output mode to line: ``` .output line ``` --- --- title: Meta-Commands sidebar_label: Meta-Commands --- # Meta-Commands ## Available Commands | Command | Description |-|- | [.autocomplete](/docs/reference/dot-commands/autocomplete) | Enable or disable autocomplete | [.cache](/docs/reference/dot-commands/cache) | Enable, disable or clear the query cache | [.cache_ttl](/docs/reference/dot-commands/cache_ttl) | Set the query cache TTL (time-to-live) for this session | [.clear](/docs/reference/dot-commands/clear) | Clear the console screen | [.connections](/docs/reference/dot-commands/connections) | List active connections | [.exit](/docs/reference/dot-commands/exit) | Exit from steampipe terminal | [.header](/docs/reference/dot-commands/header) | Enable or disable column headers | [.help](/docs/reference/dot-commands/help) | Show steampipe help | [.inspect](/docs/reference/dot-commands/inspect) | View connection, table & column information | [.multi](/docs/reference/dot-commands/multi) | Enable or disable multiline mode | [.output](/docs/reference/dot-commands/output) | Set output format | [.quit](/docs/reference/dot-commands/quit) | Exit from steampipe terminal | [.search_path](/docs/reference/dot-commands/search_path) | Display the current search path, or set the search_path by passing in a comma-separated list | [.search_path_prefix](/docs/reference/dot-commands/search_path_prefix) | Set a prefix to the current search_path | [.separator](/docs/reference/dot-commands/separator) | Set csv output separator | [.tables](/docs/reference/dot-commands/tables) | List or describe tables | [.timing](/docs/reference/dot-commands/timing) | Enable or disable query execution timing --- --- title: .quit sidebar_label: .quit --- # .quit Exit the steampipe interactive query session. ## Usage ``` .quit ``` --- --- title: .search_path sidebar_label: .search_path --- # .search_path Display the current [search path](/docs/managing/connections#setting-the-search-path), or set the search path by passing in a comma-separated list. ## Usage ``` .search_path [string,string,...] ``` ## Examples show the current search_path: ``` .search_path ``` Set the search path: ``` .search_path aws_prod,aws_dev,gcp_prod,slack,github,shodan ``` --- --- title: .search_path_prefix sidebar_label: .search_path_prefix --- # .search_path_prefix Set a prefix to the current [search path](/docs/managing/connections#setting-the-search-path) by passing in a comma-separated list. ## Usage ``` .search_path_prefix [string,string,...] ``` ## Examples Move the `aws_123456789012` connection to the front of the search path: ``` .search_path_prefix aws_123456789012 ``` Move the `aws_dev` and `gcp_dev` connections to the front of the search path: ``` .search_path_prefix aws_dev,gcp_dev ``` --- --- title: .separator sidebar_label: .separator --- # .separator Set the separator string when the output mode is csv (the default is `,`). ## Usage ``` .separator {character} ``` ## Examples Return output in pipe-separated format: ``` .output csv .separator '|' ``` --- --- title: .tables sidebar_label: .tables --- # .tables List the available tables. 
## Usage ``` .tables [connection] ``` ## Examples List all tables in all active connections: ``` .tables ``` List all tables in the `aws` connection: ``` .tables aws ``` --- --- title: .timing sidebar_label: .timing --- # .timing Enable or disable query execution timing: | Level | Description |-----------|------------------------- | `off` | Turn off query timer (default) | `on` | Display time elapsed after every query | `verbose` | Display time elapsed and details of each scan ## Usage ``` .timing [on | off | verbose] ``` ## Examples Turn off query timing: ``` .timing off ``` Turn on query timing: ``` .timing on ``` Turn on verbose query timing: ``` .timing verbose ``` --- --- title: Environment Variables sidebar_label: Environment Variables --- # Environment Variables Steampipe supports environment variables to allow you to change its default behavior. These are optional settings - You are not required to set any environment variables. Note that plugins may also support environment variables, but these are plugin-specific - refer to your plugin's documentation on hub.steampipe.io for details. ## Steampipe Environment Variables | Command | Default | Description |-|-|- | [PIPES_HOST](/docs/reference/env-vars/pipes_host) | `pipes.turbot.com` | Set the Turbot Pipes host, for connecting to Turbot Pipes workspace. | [PIPES_TOKEN](/docs/reference/env-vars/pipes_token) | | Set the Turbot Pipes authentication token for connecting to Turbot Pipes workspace. | [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache)| `true` | Enable/disable caching. | [STEAMPIPE_CACHE_MAX_SIZE_MB](/docs/reference/env-vars/steampipe_cache_max_size_mb)| unlimited | Set the maximum size of the query cache across all plugins. [DEPRECATED - use `STEAMPIPE_PLUGIN_MEMORY_MAX_MB`]. | [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl)| `300` | The maximum amount of time to cache results, in seconds. | [STEAMPIPE_CACHE_TTL](/docs/reference/env-vars/steampipe_cache_ttl)| `300` | The amount of time to cache results, in seconds. | [STEAMPIPE_DATABASE_PASSWORD](/docs/reference/env-vars/steampipe_database_password)| randomly generated | Set the steampipe database password for this session. This variable must be set when the steampipe service starts. | [STEAMPIPE_DATABASE_SSL_PASSWORD](/docs/reference/env-vars/steampipe_database_ssl_password)| | Set the passphrase used to decrypt the private key for your custom SSL certificate. By default, Steampipe generates a certificate without a passphrase; you only need to set this variable if you use a custom certificate that is protected by a passphrase. | [STEAMPIPE_DATABASE_START_TIMEOUT](/docs/reference/env-vars/steampipe_database_start_timeout)| `30` | Set the maximum time (in seconds) to wait for the Postgres process to start accepting queries after it has been started. | [STEAMPIPE_DIAGNOSTIC_LEVEL](/docs/reference/env-vars/steampipe_diagnostic_level)| `NONE` | Sets the diagnostic level. Supported levels are `ALL`, `NONE`. | [STEAMPIPE_INSTALL_DIR](/docs/reference/env-vars/steampipe_install_dir)| `~/.steampipe` | The directory in which the Steampipe database, plugins, and supporting files can be found. | [STEAMPIPE_LOG](/docs/reference/env-vars/steampipe_log) | `warn` | Set the logging output level [DEPRECATED - use `STEAMPIPE_LOG_LEVEL`]. | [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) | `warn` | Set the logging output level. 
| [STEAMPIPE_MEMORY_MAX_MB](/docs/reference/env-vars/steampipe_memory_max_mb)| `1024` | Set a soft memory limit for the `steampipe` process. | [STEAMPIPE_OTEL_INSECURE](/docs/reference/env-vars/steampipe_otel_insecure) | `false` | Bypass the SSL/TLS secure connection requirements when connecting to an OpenTelemetry server. | [STEAMPIPE_OTEL_LEVEL](/docs/reference/env-vars/steampipe_otel_level) | `NONE` | Specify which [OpenTelemetry](https://opentelemetry.io/) data to send via OTLP. | [STEAMPIPE_PLUGIN_MEMORY_MAX_MB](/docs/reference/env-vars/steampipe_plugin_memory_max_mb)| `1024` | Set a default memory soft limit for each plugin process. | [STEAMPIPE_QUERY_TIMEOUT](/docs/reference/env-vars/steampipe_query_timeout) | `240` for controls, unlimited in all other cases. | Set the amount of time to wait for a query to complete before timing out, in seconds. | [STEAMPIPE_SNAPSHOT_LOCATION](/docs/reference/env-vars/steampipe_snapshot_location) | The Turbot Pipes user's personal workspace | Set the Turbot Pipes workspace or filesystem path for writing snapshots. | [STEAMPIPE_TELEMETRY](/docs/reference/env-vars/steampipe_telemetry) | `info` | Set the level of telemetry data to collect and send. | [STEAMPIPE_UPDATE_CHECK](/docs/reference/env-vars/steampipe_update_check)| `true` | Enable/disable automatic update checking. | [STEAMPIPE_WORKSPACE](/docs/reference/env-vars/steampipe_workspace) | `default` | Set the Steampipe workspace. This can be a named workspace from `workspaces.spc` or a remote Turbot Pipes workspace. | [STEAMPIPE_WORKSPACE_DATABASE](/docs/reference/env-vars/steampipe_workspace_database) | `local` | Workspace database. This can be `local` or a remote Turbot Pipes database. --- --- title: PIPES_HOST sidebar_label: PIPES_HOST --- # PIPES_HOST Sets the Turbot Pipes host used when connecting to Turbot Pipes workspaces. The default is `pipes.turbot.com` -- you only need to set this if you are connecting to a remote Turbot Pipes database that is NOT hosted in `pipes.turbot.com`, such as an enterprise tenant instance. Your `PIPES_TOKEN` must be valid for the `PIPES_HOST`. ## Usage Default to using workspaces in `test.turbot.com`: ```bash export PIPES_HOST=test.turbot.com export PIPES_TOKEN=tpt_c6f5tmpe4mv9appio5rg_3jz0a8fakekeyf8ng72qr646 ``` --- --- title: PIPES_TOKEN sidebar_label: PIPES_TOKEN --- # PIPES_TOKEN Sets the [Turbot Pipes authentication token](https://turbot.com/pipes/docs/profile#tokens). This is used when connecting to Turbot Pipes workspaces. By default, Steampipe will use the token obtained by running `steampipe login`, but you may also set this to a user-generated [API token](https://turbot.com/pipes/docs/profile#tokens). You can manage your API tokens from the **Settings** page for your user account in Turbot Pipes. ## Usage Set your API token: ```bash export PIPES_TOKEN=tpt_c6f5tmpe4mv9appio5rg_3jz0a8fakekeyf8ng72qr646 ``` --- --- title: STEAMPIPE_CACHE sidebar_label: STEAMPIPE_CACHE --- # STEAMPIPE_CACHE Enable or disable automatic caching of results. This can significantly improve performance of some queries, at the expense of data freshness. Caching is enabled by default. Set `STEAMPIPE_CACHE` to `true` to enable caching, or `false` to disable. 
## Usage Disable caching: ```bash export STEAMPIPE_CACHE=false ``` Enable caching: ```bash export STEAMPIPE_CACHE=true ``` ## Advanced: Client & Server Caching Note that when connecting to a remote Steampipe database instance, `STEAMPIPE_CACHE` can be set at BOTH the server and the client: - Setting `STEAMPIPE_CACHE` on the host where the Steampipe database is running controls whether caching is enabled at all (the same as setting the `cache` argument in [database options](/docs/reference/config-files/options#database-options)). - Setting `STEAMPIPE_CACHE` on the remote client controls the caching options for this session only (the same as setting the `cache` argument in the [workspace](/docs/reference/config-files/workspace)). - If the server has caching enabled, a client may choose to disable it for their session, but if the server disables caching then no client can enable it. --- --- title: STEAMPIPE_CACHE_MAX_SIZE_MB sidebar_label: STEAMPIPE_CACHE_MAX_SIZE_MB --- # STEAMPIPE_CACHE_MAX_SIZE_MB ***`STEAMPIPE_CACHE_MAX_SIZE_MB` is deprecated and will be removed in a future version of Steampipe. Use `STEAMPIPE_PLUGIN_MEMORY_MAX_MB` to manage memory limits for plugins.*** Set the maximum size (in MB) of the query cache across all plugins. If `STEAMPIPE_CACHE_MAX_SIZE_MB` is set, Steampipe will limit the query cache ***across all plugins*** to the specified size. Each plugin version runs in a separate process, and each plugin process has its own cache. When `STEAMPIPE_CACHE_MAX_SIZE_MB` is set, Steampipe divides the cache based on the total number of connections and allocates memory shares to each plugin process based on the number of connections for that plugin. For example, consider the following case: - A Steampipe instance has: - 10 `aws` connections - 5 `azure` connections - 5 `gcp` connections - 1 `github` connection - 1 `rss` connection - 1 `net` connection - 2 `csv` connections - `STEAMPIPE_CACHE_MAX_SIZE_MB` is set to `4000` In this example, there are 25 total connections, so each connection's share is 4000 / 25 = ***160 MB***. The plugin query caches will be capped as follows: | Plugin | Max Cache Size |---------|----------------- | `aws` | **1600 MB** (10 connections x 160 MB) | `azure` | **800 MB** (5 connections x 160 MB) | `gcp` | **800 MB** (5 connections x 160 MB) | `github`| **160 MB** (1 connection x 160 MB) | `rss` | **160 MB** (1 connection x 160 MB) | `net` | **160 MB** (1 connection x 160 MB) | `csv` | **320 MB** (2 connections x 160 MB) By default, Steampipe does not limit the size of the query cache. This is a server setting. When connecting to a remote database, setting `STEAMPIPE_CACHE_MAX_SIZE_MB` on the client will have no effect. `STEAMPIPE_CACHE_MAX_SIZE_MB` only works with plugins compiled with [Steampipe Plugin SDK](https://github.com/turbot/steampipe-plugin-sdk) version 4.0.0 and later. ## Usage Limit cache to 4GB: ```bash export STEAMPIPE_CACHE_MAX_SIZE_MB=4096 ``` Reset caching to unlimited: ```bash unset STEAMPIPE_CACHE_MAX_SIZE_MB ``` --- --- title: STEAMPIPE_CACHE_MAX_TTL sidebar_label: STEAMPIPE_CACHE_MAX_TTL --- # STEAMPIPE_CACHE_MAX_TTL The maximum amount of time to cache query results, in seconds. The default is `300` (5 minutes). Caching must be enabled for this setting to take effect. This is a server setting, not a client setting. When connecting to a Steampipe database, you are subject to the `STEAMPIPE_CACHE_MAX_TTL` set on the server. 
You can set the [STEAMPIPE_CACHE_TTL](/docs/reference/env-vars/steampipe_cache_ttl) (or `cache_ttl` in a [workspace](/docs/reference/config-files/workspace)) from your client to *reduce* the TTL for your session but not to expand it. The net effect for your session will be the lower of the two values. ## Usage Set the maximum query cache TTL to 10 minutes: ```bash export STEAMPIPE_CACHE_MAX_TTL=600 ``` --- --- title: STEAMPIPE_CACHE_TTL sidebar_label: STEAMPIPE_CACHE_TTL --- # STEAMPIPE_CACHE_TTL The amount of time to cache query results for this client, in seconds. The default is `300` (5 minutes). Caching must be enabled for this setting to take effect. This is a client setting. When connecting to a Steampipe database, you are also subject to the [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl) set on the server. You can set the `STEAMPIPE_CACHE_TTL` (or `cache_ttl` in a [workspace](/docs/reference/config-files/workspace)) from your client to *reduce* the TTL for your session but not to expand it. The net effect for your session will be the lower of the two values. ## Usage Set TTL to 1 minute: ```bash export STEAMPIPE_CACHE_TTL=60 ``` --- --- title: STEAMPIPE_DATABASE_PASSWORD sidebar_label: STEAMPIPE_DATABASE_PASSWORD --- # STEAMPIPE_DATABASE_PASSWORD Sets the Steampipe database password for this session. By default, steampipe creates a random, unique password for the `steampipe` user. To use a different password, set the `STEAMPIPE_DATABASE_PASSWORD` variable and start the steampipe service. Note the following: - Steampipe sets the `steampipe` user password when the database starts, thus this variable must be set when the steampipe service starts. - If the `--database-password` argument is passed to `steampipe service start`, it will override this environment variable. - Setting `STEAMPIPE_DATABASE_PASSWORD` (or passing the `--database-password` argument) sets the password for the current service instance only - it does not permanently change the steampipe password. You can permanently change the default password by editing the `~/.steampipe/internal/.passwd` file. Deleting this file will result in a new random password being generated the next time Steampipe starts. - Both `steampipe` and `root` can log in from the local host ([`samehost` in the `pg_hba.conf` file](https://www.postgresql.org/docs/14/auth-pg-hba-conf.html)) without a password, regardless of the `STEAMPIPE_DATABASE_PASSWORD` value. ## Usage Start the steampipe service with a custom password: ```bash export STEAMPIPE_DATABASE_PASSWORD=MyPassword123 steampipe service start ``` --- --- title: STEAMPIPE_DATABASE_SSL_PASSWORD sidebar_label: STEAMPIPE_DATABASE_SSL_PASSWORD --- # STEAMPIPE_DATABASE_SSL_PASSWORD Sets the `server.key` passphrase. By default, Steampipe generates a certificate without a passphrase; you only need to set this variable if you use a custom certificate that is protected by a passphrase. To use a custom certificate with a passphrase: - `STEAMPIPE_DATABASE_SSL_PASSWORD` must be set when you start Steampipe. - The `server.key` content **must** contain [Proc-Type](https://datatracker.ietf.org/doc/html/rfc1421#section-4.6.1.1) and [DEK-Info](https://datatracker.ietf.org/doc/html/rfc1421#section-4.6.1.3) headers. 
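If you are unsure whether your custom key is passphrase-protected, one quick way to check is to look for those headers. This is just a sketch, assuming your key file is named `server.key` in the current directory:

```bash
# Print the Proc-Type and DEK-Info headers if the key is encrypted;
# no output means the key has no passphrase.
grep -E 'Proc-Type|DEK-Info' server.key
```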
## Usage Start the Steampipe service with a custom password: ```bash export STEAMPIPE_DATABASE_SSL_PASSWORD=MyPassPhrase steampipe service start ``` --- --- title: STEAMPIPE_DATABASE_START_TIMEOUT sidebar_label: STEAMPIPE_DATABASE_START_TIMEOUT --- # STEAMPIPE_DATABASE_START_TIMEOUT The maximum time (in seconds) to wait for the Postgres process to start accepting queries after it has been started. The default is `30`. Note that `STEAMPIPE_DATABASE_START_TIMEOUT` is ignored if the database goes into recovery mode upon startup -- Steampipe will wait indefinitely for recovery to complete. This can be cancelled with `Ctrl+C`. ## Usage Set database start timeout to 5 minutes: ```bash export STEAMPIPE_DATABASE_START_TIMEOUT=300 ``` --- --- title: STEAMPIPE_DIAGNOSTIC_LEVEL sidebar_label: STEAMPIPE_DIAGNOSTIC_LEVEL --- # STEAMPIPE_DIAGNOSTIC_LEVEL Sets the diagnostic level. Supported levels are `ALL`, `NONE`. By default, the diagnostic level is `NONE`. ## Usage ```bash export STEAMPIPE_DIAGNOSTIC_LEVEL=ALL ``` When enabled, diagnostics information will appear in the `_ctx` column for all tables: ```sql > select jsonb_pretty(_ctx),display_name from aws_sns_topic limit 1 +-----------------------------------------------------------+--------------+ | jsonb_pretty | display_name | +-----------------------------------------------------------+--------------+ | { | | | "connection": "aws_dev_01", | | | "diagnostics": { | | | "calls": [ | | | { | | | "type": "list", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "ListTopics", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dev_01" | | | }, | | | "function_name": "listAwsSnsTopics", | | | "rate_limiters": [ | | | "sns_list_topics", | | | "aws_global_concurrency" | | | ], | | | "rate_limiter_delay_ms": 0 | | | }, | | | { | | | "type": "hydrate", | | | "scope_values": { | | | "table": "aws_sns_topic", | | | "action": "GetTopicAttributes", | | | "region": "us-east-1", | | | "service": "sns", | | | "connection": "aws_dev_01" | | | }, | | | "function_name": "getTopicAttributes", | | | "rate_limiters": [ | | | "sns_get_topic_attributes_us_east_1", | | | "aws_global_concurrency" | | | ], | | | "rate_limiter_delay_ms": 107 | | | } | | | ] | | | } | | | } | | ``` The diagnostics information includes information about each Get, List, and Hydrate function that was called to fetch the row, including: | Key | Description |-------------------------|---------------------- | `type` | The type of function (`list`, `get`, or `hydrate`). | `function_name` | The name of the function. | `scope_values` | A map of scope names to values. This includes the built-in scopes as well as any matrix qualifier scopes and function tags. | `rate_limiters` | A list of the rate limiters that are scoped to the function. | `rate_limiter_delay_ms` | The amount of time (in milliseconds) that Steampipe waited before calling this function due to client-side (`limiter`) rate limiting. --- --- title: STEAMPIPE_INSTALL_DIR sidebar_label: STEAMPIPE_INSTALL_DIR --- # STEAMPIPE_INSTALL_DIR Sets the directory for the Steampipe installation, in which the Steampipe database, plugins, and supporting files can be found. Steampipe is distributed as a single binary - when you install Steampipe, either via `brew install` or via the `curl` script, the `steampipe` binary is installed into your path. The first time that you run Steampipe, it will download and install the embedded database, foreign data wrapper extension, and other required files. 
By default, these files are installed to `~/.steampipe`; however, you can change this location with the `STEAMPIPE_INSTALL_DIR` environment variable or the `--install-dir` command line argument. Steampipe will read the `STEAMPIPE_INSTALL_DIR` variable each time it runs; if it's not set, Steampipe will use the default path (`~/.steampipe`). If you wish to **ALWAYS** run Steampipe from the alternate path, you should set your environment variable in a way that will persist across sessions (in your `.profile` for example). To install a new Steampipe instance into an alternate path, simply specify the path in `STEAMPIPE_INSTALL_DIR` and then run a `steampipe` command (alternatively, use the `--install-dir` argument). If the specified directory is empty or files are missing, Steampipe will install and update the database and files to `STEAMPIPE_INSTALL_DIR`. If the directory does not exist, Steampipe will create it. It is possible to have multiple, parallel steampipe instances on a given machine using `STEAMPIPE_INSTALL_DIR`, as long as each is running on a different port. ## Usage Set `STEAMPIPE_INSTALL_DIR` to `~/mypath`. You will likely want to set this in your `.profile`. ```bash export STEAMPIPE_INSTALL_DIR=~/mypath ``` --- --- title: STEAMPIPE_LOG_LEVEL sidebar_label: STEAMPIPE_LOG_LEVEL --- # STEAMPIPE_LOG_LEVEL Sets the output logging level. Standard log levels are supported (`TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`). By default, the log level is `WARN`. Logs are written to `~/.steampipe/logs/`. ## Usage ```bash export STEAMPIPE_LOG_LEVEL=TRACE ``` --- --- title: STEAMPIPE_MEMORY_MAX_MB sidebar_label: STEAMPIPE_MEMORY_MAX_MB --- # STEAMPIPE_MEMORY_MAX_MB Set a soft memory limit for the `steampipe` process. Steampipe sets `GOMEMLIMIT` for the `steampipe` process to the specified value. The Go runtime does not guarantee that the memory usage will not exceed the limit, but rather uses it as a target to optimize garbage collection. Set `STEAMPIPE_MEMORY_MAX_MB` to `0` to disable the soft memory limit. ## Usage Set the memory soft limit to 2GB: ```bash export STEAMPIPE_MEMORY_MAX_MB=2048 ``` Disable the memory soft limit: ```bash export STEAMPIPE_MEMORY_MAX_MB=0 ``` --- --- title: STEAMPIPE_OTEL_INSECURE sidebar_label: STEAMPIPE_OTEL_INSECURE --- # STEAMPIPE_OTEL_INSECURE Set `STEAMPIPE_OTEL_INSECURE` to `true` to bypass the default secure connection requirements when connecting to an OpenTelemetry server. This enables Steampipe to communicate with the OpenTelemetry server without needing SSL/TLS encryption. This can be useful for local testing or when operating within a secure, isolated network where encryption may not be deemed necessary. ## Usage If you are connecting to a local insecure OpenTelemetry server: ```bash export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:16686/ export STEAMPIPE_OTEL_INSECURE=true ``` --- --- title: STEAMPIPE_OTEL_LEVEL sidebar_label: STEAMPIPE_OTEL_LEVEL --- # STEAMPIPE_OTEL_LEVEL Specify which [OpenTelemetry](https://opentelemetry.io/) data to send via OTLP. Accepted values are: | Level | Description |-|- | `ALL` | Send Metrics and Traces via OTLP | `METRICS` | Send Metrics via OTLP | `NONE` | Do not send OpenTelemetry data (default) | `TRACE` | Send Traces via OTLP Steampipe is instrumented with the OpenTelemetry SDK, which supports the [standard SDK environment variables](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-environment-variables.md). 
If `OTEL_EXPORTER_OTLP_ENDPOINT` is not specified, Steampipe will default to `localhost:4317`. Currently, Steampipe only supports OTLP/gRPC. The Steampipe Plugin SDK added OpenTelemetry support in version 3.3.0 - plugins must be compiled with `v3.3.0` or later. ## Usage Trace a single query and send to default endpoint (`localhost:4317`): ```bash STEAMPIPE_OTEL_LEVEL=TRACE steampipe query "select * from aws_iam_role" ``` Turn on Metrics and Tracing for the session: ```bash export OTEL_EXPORTER_OTLP_ENDPOINT=my-otel-collector.mydomain.com:4317 export STEAMPIPE_OTEL_LEVEL=ALL steampipe query ``` --- --- title: STEAMPIPE_PLUGIN_MEMORY_MAX_MB sidebar_label: STEAMPIPE_PLUGIN_MEMORY_MAX_MB --- # STEAMPIPE_PLUGIN_MEMORY_MAX_MB Set a default memory soft limit for each plugin process. Steampipe sets `GOMEMLIMIT` for each plugin process to the specified value. The Go runtime does not guarantee that the memory usage will not exceed the limit, but rather uses it as a target to optimize garbage collection. Note that each plugin can have its own `memory_max_mb` set in [a `plugin` definition](/docs/reference/config-files/plugin), and that value would override this default setting. Set `STEAMPIPE_PLUGIN_MEMORY_MAX_MB` to `0` to disable the default soft memory limit. ## Usage Set the default plugin memory soft limit to 2GB: ```bash export STEAMPIPE_PLUGIN_MEMORY_MAX_MB=2048 ``` Disable the default plugin memory soft limit: ```bash export STEAMPIPE_PLUGIN_MEMORY_MAX_MB=0 ``` --- --- title: STEAMPIPE_QUERY_TIMEOUT sidebar_label: STEAMPIPE_QUERY_TIMEOUT --- # STEAMPIPE_QUERY_TIMEOUT The amount of time to wait for a query to complete before timing out, in seconds. Set to `0` to disable the query timeout. The default is `240` for controls and `0` (no timeout) in all other cases. ## Usage Set query timeout to 2 minutes: ```bash export STEAMPIPE_QUERY_TIMEOUT=120 ``` Disable the query timeout: ```bash export STEAMPIPE_QUERY_TIMEOUT=0 ``` Reset query timeout to the default: ```bash unset STEAMPIPE_QUERY_TIMEOUT ``` --- --- title: STEAMPIPE_SNAPSHOT_LOCATION sidebar_label: STEAMPIPE_SNAPSHOT_LOCATION --- # STEAMPIPE_SNAPSHOT_LOCATION Sets the location to write [snapshots](/docs/query/snapshots) - either a local file path or a [Turbot Pipes workspace](https://turbot.com/pipes/docs/workspaces). By default, Steampipe will write snapshots to your default Turbot Pipes user workspace. ## Usage Set the snapshot location to a local filesystem path: ```bash export STEAMPIPE_SNAPSHOT_LOCATION=~/my-snaps ``` Set the snapshot location to a Turbot Pipes workspace: ```bash export STEAMPIPE_SNAPSHOT_LOCATION=vandelay-industries/latex ``` --- --- title: STEAMPIPE_TELEMETRY sidebar_label: STEAMPIPE_TELEMETRY --- # STEAMPIPE_TELEMETRY By default, Steampipe collects usage information to help assess features, usage patterns, and bugs. This information helps us improve and optimize the Steampipe experience. We do not collect any sensitive information such as secrets, environment variables, or file contents. We do not share your data with anyone. Current options are: - `none`: do not collect or send any telemetry data. - `info`: send basic information such as which plugins are installed, what mods are used, how and when Steampipe is started and stopped. 
**If you are connecting to [Turbot Pipes](https://turbot.com/pipes/docs)**, we also include the following: - actor id - actor handle - identity id - identity handle - identity type - workspace id ## Usage Disable telemetry data: ```bash export STEAMPIPE_TELEMETRY=none ``` Enable telemetry data at `info` level (this is the default): ```bash export STEAMPIPE_TELEMETRY=info ``` --- --- title: STEAMPIPE_UPDATE_CHECK sidebar_label: STEAMPIPE_UPDATE_CHECK --- # STEAMPIPE_UPDATE_CHECK Enable or disable automatic update checking. Update checking is enabled by default. Set to `false` to disable update checking. ## Usage Disable update check: ```bash export STEAMPIPE_UPDATE_CHECK=false ``` Enable update check: ```bash export STEAMPIPE_UPDATE_CHECK=true ``` or ```bash unset STEAMPIPE_UPDATE_CHECK ``` --- --- title: STEAMPIPE_WORKSPACE sidebar_label: STEAMPIPE_WORKSPACE --- # STEAMPIPE_WORKSPACE Sets the Steampipe [workspace](/docs/reference/config-files/workspace). A Steampipe `workspace` is a "profile" that allows you to define a unified environment that the Steampipe client can interact with. To learn more, see **[Managing Workspaces →](/docs/managing/workspaces)** ## Usage Use the `my_workspace` workspace: ```bash export STEAMPIPE_WORKSPACE=my_workspace ``` Use the `acme/prod` Turbot Pipes workspace: ```bash export STEAMPIPE_WORKSPACE=acme/prod ``` --- --- title: STEAMPIPE_WORKSPACE_DATABASE sidebar_label: STEAMPIPE_WORKSPACE_DATABASE --- # STEAMPIPE_WORKSPACE_DATABASE Sets the database that Steampipe will connect to. By default, Steampipe will use the locally installed database (`local`). Alternatively, you can use a remote database such as a Turbot Pipes workspace database. ## Usage Use a remote Turbot Pipes workspace database: ```bash export PIPES_TOKEN=tpt_c6f5tmpe4mv9appio5rg_3jz0a8fakekeyf8ng72qr646 export STEAMPIPE_WORKSPACE_DATABASE=acme/prod ``` Use a remote Postgres database via a connection string: ```bash export STEAMPIPE_WORKSPACE_DATABASE=postgresql://myusername:mypassword@acme-prod.apse1.db.cloud.turbot.io:9193/aaa000 ``` --- --- title: Steampipe Reference sidebar_label: Reference --- # Steampipe Reference When all else fails, read the manual... - **[Command line reference →](/docs/reference/cli/overview)** - **[Config File reference →](/docs/reference/config-files/overview)** - **[Meta-Commands reference →](/docs/reference/dot-commands/overview)** - **[Environment Variables reference →](/docs/reference/env-vars/overview)** --- --- title: Querying IP Addresses sidebar_label: Querying IP Addresses --- # Querying IP Addresses One of the primary uses of Steampipe is for auditing cloud and network infrastructure. As such, many columns store IP addresses or network addresses in CIDR format. Steampipe leverages the native [Postgres inet and cidr data types](https://www.postgresql.org/docs/14/datatype-net-types.html) for IP addresses and CIDR ranges. The essential difference between `inet` and `cidr` data types is that `inet` accepts values with nonzero bits to the right of the netmask, whereas `cidr` does not; `inet` columns can either be a single IP address OR a CIDR range, but `cidr` MUST be a CIDR range. You can use the standard [Postgres network address functions and operators](https://www.postgresql.org/docs/14/functions-net.html) with Steampipe. 
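For example, the `&&` operator tests whether two ranges **overlap**. As a quick sketch using the `aws_vpc` table that also appears in the examples that follow, you can find VPCs whose CIDR block overlaps a given range:

```sql
-- inet/cidr && returns true when either range contains or equals the other
select
  vpc_id,
  cidr_block
from
  aws_vpc
where
  cidr_block && '10.0.0.0/8';
```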
You can **extract the host, network, netmask, and broadcast addresses** from a CIDR: ```sql select vpc_id, cidr_block, host(cidr_block), broadcast(cidr_block), netmask(cidr_block), network(cidr_block) from aws_vpc; ``` You can find IP addresses that **match exactly**: ```sql select title, private_ip_address, public_ip_address from aws_ec2_instance where private_ip_address = '172.31.52.163'; ``` or find IPs that are **contained within a given CIDR range**: ```sql select title, private_ip_address, public_ip_address from aws_ec2_instance where private_ip_address <<= '172.16.0.0/12'; ``` or test whether a **CIDR contains an address**: ```sql select title, cidr_block from aws_vpc_subnet where cidr_block >> '172.31.52.163'; ``` Of course you can use 'not' to look for IP addresses that are NOT in a range as well: ```sql select vpc_id, cidr_block, state, region from aws_vpc where not cidr_block <<= '10.0.0.0/8' and not cidr_block <<= '192.168.0.0/16' and not cidr_block <<= '172.16.0.0/12'; ``` You can even **join tables** where an address from one table is contained in the network of another: ```sql select i.title as instance, i.private_ip_address, s.title as subnet, s.cidr_block from aws_ec2_instance as i join aws_vpc_subnet as s on i.private_ip_address <<= s.cidr_block; ``` This works for networks as well - you can **test whether one CIDR is contained entirely in another**: ```sql select title as subnet, cidr_block from aws_vpc_subnet where cidr_block <<= '10.0.0.0/8'; ``` --- --- title: Querying JSON sidebar_label: Querying JSON --- # Querying JSON Steampipe plugins call API functions, and quite often these functions return structured object data, most commonly in json or yaml. As a result, json columns are very common in steampipe. Fortunately, PostgreSQL has native support for json. Steampipe stores json columns using the [jsonb](https://www.postgresql.org/docs/14/datatype-json.html) datatype, and you can use the standard [Postgres JSON functions and operators](https://www.postgresql.org/docs/14/functions-json.html) with Steampipe. 
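For example, the `@>` containment operator matches rows whose json column contains a given fragment. As a sketch (the `application` tag value here is just an illustrative placeholder), you can find snapshots tagged with a particular application:

```sql
-- jsonb @> jsonb returns true when the left value contains the right fragment
select
  title,
  tags
from
  aws_ebs_snapshot
where
  tags @> '{"application": "webapp"}';
```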
To return the **full json** column, you can simply select it like any other column: ```sql select title, policy from aws_s3_bucket; ``` You can make the json more **readable** with `jsonb_pretty`: ```sql select title, jsonb_pretty(policy) from aws_s3_bucket; ``` You can **extract objects** from json columns using jsonb `->` operator: ```sql select name, acl -> 'Owner' as owner from aws_s3_bucket; ``` Alternatively you can use [array-style subscripting](https://www.postgresql.org/docs/14/datatype-json.html#JSONB-SUBSCRIPTING) with Steampipe 0.14 and later: ```sql select name, acl['Owner'] as owner from aws_s3_bucket; ``` You can **extract text** from json columns using jsonb `->>` operator: ```sql select title, tags ->> 'Name' as name, tags ->> 'application' as application, tags ->> 'owner' as owner from aws_ebs_snapshot; ``` Array subscripting ALWAYS returns jsonb though, so if you want text (similar to `->>`) you will have to extract it: ```sql select title, tags['Name'] #>> '{}' as name from aws_ebs_snapshot; ``` You can get **text from nested objects** with arrow operators: ```sql select name, acl -> 'Owner' ->> 'ID' as acl_owner_id, acl -> 'Owner' ->> 'DisplayName' as acl_owner from aws_s3_bucket; ``` or using array subscripting: ```sql select name, acl['Owner']['ID'] #>> '{}' as acl_owner_id, acl['Owner']['DisplayName'] #>> '{}' as acl_owner from aws_s3_bucket; ``` or even combining array subscripting with arrow operators: ```sql select name, acl['Owner'] ->> 'ID' as acl_owner_id, acl['Owner'] ->> 'DisplayName' as acl_owner from aws_s3_bucket; ``` You can **use jsonpath** to extract or filter data if you prefer: ```sql select name, jsonb_path_query(acl, '$.Owner.ID') as acl_owner_id, jsonb_path_query(acl, '$.Owner.DisplayName') as acl_owner from aws_s3_bucket; ``` You can **filter, sort, and group** your data using the arrow operators as well: ```sql select tags ->> 'application' as application, count(*) as count from aws_ebs_snapshot where tags ->> 'application' is not null group by application order by application asc; ``` You can **count** the number of items in a json array: ```sql select vpc_endpoint_id, jsonb_array_length(subnet_ids) as subnet_id_count from aws_vpc_endpoint; ``` You can **enumerate json arrays** and extract data from each element: ```sql select snapshot_id, volume_id, jsonb_array_elements(create_volume_permissions) as perm from aws.aws_ebs_snapshot; ``` And even **extract items within nested json** in the arrays: ```sql select snapshot_id, volume_id, jsonb_array_elements(create_volume_permissions) ->> 'UserId' as account_id from aws.aws_ebs_snapshot; ``` --- --- title: It's Just SQL! sidebar_label: It's Just SQL! --- # It's Just SQL! Steampipe leverages PostgreSQL Foreign Data Wrappers to provide a SQL interface to external services and systems. Steampipe uses an embedded PostgreSQL database (currently, version 14.2.0), and you can use [standard Postgres syntax](https://www.postgresql.org/docs/14/sql.html) to query Steampipe. ## Basic SQL Like most popular relational databases, Postgres complies with the ANSI SQL standard - If you know SQL, you already know how to query Steampipe! You can **query all the columns** in a table: ```sql select * from aws_ec2_instance; ``` This is inefficient though -- you should **only query the columns that you need**. 
This will save Steampipe from making API calls to gather data that you don't want anyway: ```sql select instance_id, instance_type, instance_state from aws_ec2_instance; ``` You can **filter** to rows where a column has a specific value: ```sql select instance_id, instance_type, instance_state from aws_ec2_instance where instance_type = 't2.small'; ``` or one of a **set** of values: ```sql select instance_id, instance_type, instance_state from aws_ec2_instance where instance_type in ('t2.small', 't2.micro'); ``` or match a **pattern**: ```sql select instance_id, instance_type, instance_state from aws_ec2_instance where instance_type like '%small'; ``` You can **filter on multiple columns**, joined by `and` or `or`: ```sql select instance_id, instance_type, instance_state from aws_ec2_instance where instance_type = 't2.small' and instance_state = 'stopped'; ``` You can **sort** your results: ```sql select name, runtime, memory_size from aws_lambda_function order by runtime; ``` You can **sort on multiple columns, ascending or descending**: ```sql select name, runtime, memory_size from aws_lambda_function order by runtime asc, memory_size desc; ``` You can group and use standard aggregate functions. You can **count** results: ```sql select runtime, count(*) from aws_lambda_function group by runtime order by count desc; ``` or **sum** them: ```sql select runtime, sum(memory_size) from aws_lambda_function group by runtime; ``` or find **min**, **max**, and **average**: ```sql select runtime, min(memory_size), max(memory_size), avg(memory_size) from aws_lambda_function group by runtime; ``` You can **exclude duplicate rows**: ```sql select distinct instance_type from aws_all.aws_ec2_instance_type ``` or exclude **all but one matching row**: ```sql select distinct on (name) name, log_group_name from aws_all.aws_cloudwatch_log_stream ``` Of course the real power of SQL is in combining data from multiple tables! You can **join tables** together on a key field. When doing so, you may need to alias the tables (with `as`) to disambiguate them: ```sql select instance.instance_id, instance.subnet_id, subnet.availability_zone from aws_ec2_instance as instance join aws_vpc_subnet as subnet on instance.subnet_id = subnet.subnet_id; ``` You can use outer joins (left, right, or full) when you want to **find non-matching** rows as well. For example, to see all your volumes and the number of snapshots taken from them: ```sql select v.volume_id, count(s.snapshot_id) as snapshot_count from aws_ebs_volume as v left join aws_ebs_snapshot as s on v.volume_id = s.volume_id group by v.volume_id; ``` or to find snapshots from volumes that no longer exist: ```sql select s.snapshot_id, s.volume_id from aws_ebs_volume as v right join aws_ebs_snapshot as s on v.volume_id = s.volume_id where v.volume_id is null; ``` You can use union queries to **combine datasets**. Note that `union all` is much more efficient if you don't need to eliminate duplicate rows. ```sql select name, arn, account_id from aws_iam_role union all select name, arn, account_id from aws_iam_user union all select name, arn, account_id from aws_iam_group; ``` --- --- title: Tips and Tricks sidebar_label: Tips and Tricks --- # Tips and Tricks ## Select only the columns that you need. This is a common recommendation for any SQL database, but it is especially important for Steampipe, as it can avoid making API calls to gather data that you don't want anyway. The difference in execution time varies by table and environment, but can be quite significant. 
For example, in a test account: `select * from aws_iam_policy;` took 14 seconds to execute, but this call took less than a second: ```sql select name, arn, is_aws_managed from aws_iam_policy; ``` The exception to this rule is the `count` aggregate function - Steampipe will optimize it, so this call is very efficient (and also takes less than a second): ```sql select count(*) from aws_iam_policy; ``` ## Limit results with a `where =` clause on key columns when possible. The Steampipe FDW can be more efficient if your query specifies the key columns exactly in a `where` clause. For example: ```sql select * from aws_ec2_instance where instance_id = 'i-0f16e4805caddfd44'; ``` For non-key columns, data for all rows must be collected, and then filtered. Currently, the only way to know definitively which columns are key columns is to check the plugin source file. ## Some tables ***require*** a where or join clause The Steampipe database doesn't store data; it makes API calls to get data dynamically. There are times when listing ALL the elements represented by a table is impossible or prohibitively slow. In such cases, a table may require you to specify a qualifier in a `where =` (or `join...on`) clause. For example, the Github `ListUsers` API will enumerate ALL Github users. It is not reasonable to page through hundreds of thousands of users to find what you are looking for. Instead, Steampipe requires that you specify `where login =` to find the user directly, for example: ```sql select * from github_user where login = 'torvalds'; ``` Alternatively, you can join on the key column (`login`) in a `where` or `join` clause: ```sql select u.login, o.login as organization, u.name, u.company, u.location from github_user as u, github_my_organization as o, jsonb_array_elements_text(o.member_logins) as member_login where u.login = member_login; ``` or ```sql select u.login, o.login as organization, u.name, u.company, u.location from github_my_organization as o, jsonb_array_elements_text(o.member_logins) as member_login join github_user as u on u.login = member_login; ``` --- --- title: Install sidebar_label: Install --- # Installing Steampipe Export CLIs Each Steampipe plugin is distributed as a distinct Steampipe Export CLI. They are available for download in the **Releases** for the corresponding plugin repo; however, it is simplest to install them with the [Steampipe Export CLI install script](https://steampipe.io/install/export.sh): ```bash /bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" ``` The installer will prompt you for the plugin name, version, and destination directory. It will then determine the OS and system architecture, and it will download and install the appropriate package. ```bash Enter the plugin name: aws Enter the version (latest): Enter location (/usr/local/bin): Created temporary directory at /var/folders/t4/1lm46wt12sv7yq1gp1swn3jr0000gn/T/tmp.RpZLlzs2. Downloading steampipe_export_aws.darwin_arm64.tar.gz... ###################################################################################################################################################################### 100.0% Deflating downloaded archive x steampipe_export_aws Installing Applying necessary permissions Removing downloaded archive steampipe_export_aws was installed successfully to /usr/local/bin ``` The installer will find the appropriate package and download it to the specified directory. If you don't want to use the installer, you can download, extract, and install the file yourself. 
There are downloadable `tar.gz` packages for all platforms available in the **Releases** for the corresponding plugin's Github repo (e.g. https://github.com/turbot/steampipe-plugin-aws/releases). --- --- title: Steampipe Export CLIs sidebar_label: Export CLIs --- # Steampipe Export CLIs Steampipe Export CLIs provide a flexible mechanism for exporting information from cloud services and APIs. Each exporter is a stand-alone binary that allows you to extract data using Steampipe plugins *without a database*. --- --- title: Run sidebar_label: Run --- # Running Steampipe Plugin Exporters Each Steampipe Export CLI is distributed as a separate binary, but the command line options are the same: ```bash Export data using the aws plugin. Find detailed usage information including table names, column names, and examples at the Steampipe Hub: https://hub.steampipe.io/plugins/turbot/aws Usage: steampipe_export_aws TABLE_NAME [flags] Flags: --config string Inline HCL config data for the connection (deprecated - use --connection instead) --config-dir string Directory to read config files from (defaults to $STEAMPIPE_INSTALL_DIR/config) --connection string Name of the connection to use (must match a connection defined in the config file) -h, --help Help for steampipe_export_aws --limit int Maximum number of rows to return (0 means no limit) --output string Output format: csv, json or jsonl (default "csv") --select strings Columns to include in the output -v, --version Version for steampipe_export_aws ``` ## Configuration Many plugins have a *default* configuration that will use environment variables or other "native" configuration files to set your credentials if you don’t provide a `--config` or a `--connection`. The behavior varies by plugin but should be documented in the [Steampipe hub](https://hub.steampipe.io/plugins). The AWS plugin, for example, will resolve the region and credentials using the same mechanism as the AWS CLI (AWS environment variables, default profile, etc). If you have AWS CLI default credentials set up, Steampipe will use them if you don't specify `--config` or `--connection`: ```bash steampipe_export_aws aws_account ``` There are a few different ways to configure the Exporters: 1. You can specify the configuration with the `--config` argument. The `--config` argument takes a string containing the HCL configuration options for the plugin. The options vary per plugin, and match the [connection](https://steampipe.io/docs/managing/connections) options for the corresponding plugin. You can view the available options and syntax for the plugin in the [Steampipe hub](https://hub.steampipe.io/plugins). Note that the `--config` argument has been deprecated in favor of `--connection`. ```bash steampipe_export_aws --config 'profile = "my_profile"' aws_account ``` Note that HCL is newline-sensitive and you must include the line break. You can use `\n` with the [bash `$'string'` syntax](https://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html#ANSI_002dC-Quoting) to accomplish this: ```bash steampipe_export_aws --config $'access_key="AKIA4YFAKEKEYT99999" \n secret_key="A32As+zuuBFThisIsAFakeSecretNb77HSLmcB"' aws_account ``` Or you can write your config to a file: ```hcl access_key = "AKIA4YFAKEKEYT99999" secret_key = "A32As+zuuBFThisIsAFakeSecretNb77HSLmcB" ``` And then `cat` the file into the `--config` arg: ```bash steampipe_export_aws --config "$(cat my_aws_config.hcl)" aws_account ``` 2. Alternatively, you can use a named connection with the `--connection` argument. 
The `--connection` argument allows you to specify the name of a Steampipe connection defined in a `.spc` config file. This is the preferred method for configuring your export tool. By default, the exporter will look for the config files in the Steampipe install directory `($STEAMPIPE_INSTALL_DIR/config)`, but you can override this path with the `--config-dir` argument. ```bash steampipe_export_aws --connection aws_prod aws_account ``` This assumes a file such as `aws.spc` exists in the Steampipe config directory with content like: ```hcl connection "aws_prod" { plugin = "aws" profile = "dundermifflin" regions = ["us-east-1", "us-west-2"] } ``` If your configuration files are stored in a different directory, specify the path with the `--config-dir` argument: ```bash steampipe_export_aws --connection aws_prod --config-dir ~/my/custom/config aws_account ``` This provides a cleaner and more reusable approach than `--config`, especially for managing multiple environments or teams. It also supports full Steampipe connection syntax including named connections, plugin configurations, credentials, and options. ## Filtering Results You can use `--limit` to specify the number of rows to return, which will reduce both the query time and the number of outbound API requests: ```bash steampipe_export_aws aws_ec2_instance --limit 3 ``` The `--select` argument allows you to specify which columns to return. Generally, you should select only the columns that you want in order to reduce the number of API calls, improve query performance, and minimize memory usage. Specify the columns you want, separated by a comma: ```bash steampipe_export_aws aws_ec2_instance --select instance_id,instance_type,account_id,region ``` The `--where` argument allows you to filter the rows based on key columns: ```bash steampipe_export_aws aws_ec2_instance --where "instance_type = 't2.micro'" ``` You can **only specify key columns** in `--where` because the Export CLI does the filtering server-side, via the API or service that it is calling. Refer to the table documentation in the [Steampipe hub](https://hub.steampipe.io/plugins) for a list of key columns (e.g. https://hub.steampipe.io/plugins/turbot/aws/tables/aws_ec2_instance#inspect). Note that you do not have to select the column to filter by it: ```bash steampipe_export_aws aws_ec2_instance --select instance_id,account_id,region,_ctx --where "instance_type = 't2.micro'" ``` The syntax for the `--where` argument generally follows the same structure as a SQL where clause comparison. 
Be aware that not all key columns support all operators (most only support `=`), and you can only use the supported operators: ```bash steampipe_export_aws aws_ec2_instance --select instance_id,instance_state,account_id,region --where "instance_type like 't2.%'" key column for 'instance_type' does not support operator '~~' ``` You can specify multiple `--where` arguments, and they will be ANDed together: ```bash steampipe_export_aws aws_ec2_instance --select instance_id,account_id,region,_ctx --where "instance_type = 't2.micro'" --where "instance_state = 'stopped'" ``` ## Formatting Output By default, the output is returned as CSV, but you can return JSON instead: ```bash steampipe_export_aws aws_ec2_instance --select instance_id,account_id,region --output json ``` Or JSON lines (JSONL): ```bash steampipe_export_aws aws_ec2_instance --select instance_id,account_id,region --output jsonl ``` ## Logging You can set the logging level with the [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) environment variable. By default, the log level is set to `warn`. ```bash export STEAMPIPE_LOG_LEVEL=DEBUG ``` Logs are written to STDERR, so by default they will be printed to the console. You can redirect them to a file instead with the standard file redirection mechanism: ```bash steampipe_export_aws aws_iam_policy 2> errors.log ``` --- --- title: Configure sidebar_label: Configure --- # Configuring Steampipe Postgres FDW To use the Steampipe Postgres FDW, you first have to create the foreign server and import the foreign schema. Log in to Postgres as a superuser and create the extension: ```sql DROP EXTENSION IF EXISTS steampipe_postgres_aws CASCADE; CREATE EXTENSION IF NOT EXISTS steampipe_postgres_aws; ``` If you want, you can verify the extension was created: ```sql select * from pg_extension ``` Now create a foreign server. Many plugins include a default configuration that may "just work", but more often you will want to explicitly set the configuration by passing the `config` option to specify the plugin-specific configuration: ```sql DROP SERVER IF EXISTS steampipe_aws_01; CREATE SERVER steampipe_aws_01 FOREIGN DATA WRAPPER steampipe_postgres_aws OPTIONS (config 'profile = "my_aws_profile"'); ``` > [!IMPORTANT] > Many plugins use environment variables or configuration files from the user's $HOME directory for some configuration options. Be aware that the user context is whichever user Postgres is running as! The `config` option takes an HCL string with the plugin [connection](https://steampipe.io/docs/managing/connections) arguments. These arguments vary per plugin. You can view the available options and syntax for the plugin in the [Steampipe hub](https://hub.steampipe.io/plugins). Note that HCL is newline-sensitive. 
To specify multiple arguments, you must include the line break inside the string: ```sql CREATE SERVER steampipe_aws_01 FOREIGN DATA WRAPPER steampipe_postgres_aws OPTIONS (config 'access_key="AKIA4YFAKEKEYT99999" secret_key="A32As+zuuBFThisIsAFakeSecretNb77HSLmcB" regions = ["*"]'); ``` If you want, you can verify the foreign server was created: ```sql select * from information_schema.foreign_servers; select * from information_schema.foreign_server_options; ``` Now that the server has been set up, create a schema and import the foreign tables: ```sql DROP SCHEMA IF EXISTS aws_01 CASCADE; CREATE SCHEMA aws_01; COMMENT ON SCHEMA aws_01 IS 'steampipe aws fdw'; IMPORT FOREIGN SCHEMA aws_01 FROM SERVER steampipe_aws_01 INTO aws_01; ``` You can query the information schema to see the foreign tables that have been added to your schema: ```sql select foreign_table_name from information_schema.foreign_tables where foreign_table_schema = 'aws_01' ``` ```sql --------------------------------------------------------------+ | foreign_table_name | |--------------------------------------------------------------| | aws_wellarchitected_workload | | aws_guardduty_finding | | aws_vpc_verified_access_instance | | aws_cloudformation_stack_set | | aws_route53_resolver_rule | | aws_securityhub_insight | | aws_securityhub_member | ... ``` Your FDW is now configured! You should now be able to run queries! ```sql select * from aws_01.aws_account; ``` You can install as many Steampipe Postgres FDWs as you like. The installation process is the same for all plugins, though the `config` arguments vary. ## Multiple Foreign Servers You can create multiple foreign servers for the same extension (plugin type). For instance, you can add a foreign server and schema for each of your AWS accounts. Because the configuration is set on the foreign server, you need to create a new foreign server for each distinct instance. You will re-use the extension that you created for the first AWS foreign server: ```sql DROP SERVER IF EXISTS steampipe_aws_02; CREATE SERVER steampipe_aws_02 FOREIGN DATA WRAPPER steampipe_postgres_aws OPTIONS (config 'profile = "my_aws_profile_2"'); ``` Now that the server has been set up, create a schema and import the foreign tables: ```sql DROP SCHEMA IF EXISTS aws_02 CASCADE; CREATE SCHEMA aws_02; COMMENT ON SCHEMA aws_02 IS 'steampipe aws fdw - aws_02'; IMPORT FOREIGN SCHEMA aws_02 FROM SERVER steampipe_aws_02 INTO aws_02; ``` You can now query the tables in your new schema: ```sql select * from aws_02.aws_account ``` You can even create views to aggregate them: ```sql CREATE VIEW aws_account AS select * from aws_01.aws_account union all select * from aws_02.aws_account ``` ```sql select * from aws_account ``` ## Editing the Configuration If desired, you can change the foreign server configuration by editing the `config` option: ```sql ALTER SERVER steampipe_aws_01 OPTIONS (SET config 'profile = "my_new_profile" regions = ["*"]'); ``` ## Removing the Configuration You can remove the FDW configuration by dropping the relevant objects: ```sql DROP SCHEMA IF EXISTS aws_01 CASCADE; DROP SERVER IF EXISTS steampipe_aws_01; DROP EXTENSION IF EXISTS steampipe_postgres_aws CASCADE; ``` ## Caching By default, query results are cached for 5 minutes. 
You can change the duration with the [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl) environment variable: ```bash export STEAMPIPE_CACHE_MAX_TTL=600 # 10 minutes ``` or disable caching with the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable: ```bash export STEAMPIPE_CACHE=false ``` ## Logging You can set the logging level with the [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) environment variable. By default, the log level is set to `warn`. Logs are written to the Postgres database logs. ```bash export STEAMPIPE_LOG_LEVEL=DEBUG ``` --- --- title: Install sidebar_label: Install --- # Installing Steampipe Postgres FDW Each Steampipe plugin is distributed as a distinct Steampipe Postgres FDW. They are available for download in the **Releases** for the corresponding plugin repo; however, it is simplest to install them with the [Steampipe Postgres FDW install script](https://steampipe.io/install/postgres.sh): ```bash /bin/sh -c "$(curl -fsSL https://steampipe.io/install/postgres.sh)" ``` The installer will prompt you for the plugin name and version, determine the OS, system architecture, and Postgres version, and download the appropriate package. It will use `pg_config` to determine where to install the files, prompt for confirmation, and then copy them: ```bash $ /bin/sh -c "$(curl -fsSL https://steampipe.io/install/postgres.sh)" Enter the plugin name: aws Enter the version (latest): Discovered: - PostgreSQL version: 15 - PostgreSQL location: /Applications/Postgres.app/Contents/Versions/15 - Operating system: Darwin - System architecture: arm64 Based on the above, steampipe_postgres_aws.pg15.darwin_arm64.tar.gz will be downloaded, extracted and installed at: /Applications/Postgres.app/Contents/Versions/15 Proceed with installing Steampipe PostgreSQL FDW for version 15 at /Applications/Postgres.app/Contents/Versions/15? - Press 'y' to continue with the current version. - Press 'n' to customize your PostgreSQL installation directory and select a different version. (Y/n): Downloading https://api.github.com/repos/turbot/steampipe-plugin-aws/releases/latest/releases/assets/139269139... ###################################################################################################################################################################### 100.0% x steampipe_postgres_aws.pg15.darwin_arm64/ x steampipe_postgres_aws.pg15.darwin_arm64/steampipe_postgres_aws.so x steampipe_postgres_aws.pg15.darwin_arm64/create_extension_aws.sql x steampipe_postgres_aws.pg15.darwin_arm64/install.sh x steampipe_postgres_aws.pg15.darwin_arm64/steampipe_postgres_aws--1.0.sql x steampipe_postgres_aws.pg15.darwin_arm64/steampipe_postgres_aws.control Download and extraction completed. Installing steampipe_postgres_aws in /Applications/Postgres.app/Contents/Versions/15... Successfully installed steampipe_postgres_aws extension! Files have been copied to: - Library directory: /Applications/Postgres.app/Contents/Versions/15/lib/postgresql - Extension directory: /Applications/Postgres.app/Contents/Versions/15/share/postgresql/extension/ ``` If you don't want to use the installer, you can download, extract, and install the files yourself. There are downloadable `tar.gz` packages for all platforms available in the **Releases** for the corresponding plugin's Github repo (e.g. https://github.com/turbot/steampipe-plugin-aws/releases). 
If you don't want to use the installer, you can download, extract, and install the files yourself. There are downloadable `tar.gz` packages for all platforms available in the **Releases** for the corresponding plugin's Github repo (e.g. https://github.com/turbot/steampipe-plugin-aws/releases). The `tar.gz` includes an `install.sh` script that you can run to install the files, or you can copy them manually:

```bash
export LIBDIR=$(pg_config --pkglibdir)
export EXTDIR=$(pg_config --sharedir)/extension/
sudo cp steampipe*.so $LIBDIR
sudo cp steampipe*.sql $EXTDIR
sudo cp steampipe*.control $EXTDIR
```

---

---
title: Steampipe Postgres FDWs
sidebar_label: Postgres FDWs
---

# Steampipe Postgres FDWs

Steampipe Postgres FDWs are native Postgres Foreign Data Wrappers that translate APIs to foreign tables. Unlike the Steampipe CLI, which ships with its own Postgres server instance, the Steampipe Postgres FDWs can be installed in any supported Postgres database version.

---

---
title: Query
sidebar_label: Query
---

# Querying Steampipe Postgres FDW

Your Steampipe Postgres FDW adds foreign tables to your Postgres installation. Typically, these tables are prefixed with the plugin name. There is extensive documentation for the plugin in the [Steampipe Hub](https://hub.steampipe.io/plugins), including sample queries for each table. You can also query the information schema to list the foreign tables that have been added to your schema:

```sql
select foreign_table_name from information_schema.foreign_tables where foreign_table_schema = 'aws_01'
```

You can use standard Postgres syntax to query the tables. Note that you will have to qualify the table names with the schema name unless you add the schema to the [search path](https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATH):

```sql
select instance_id, instance_type, instance_state, region, account_id
from aws_01.aws_ec2_instance
```

There are many [examples in the Steampipe documentation](/docs/sql/steampipe-sql), as well as the [Steampipe Hub](https://hub.steampipe.io/plugins). These examples all use unqualified table names, so if you want to run them as-is, you'll need to add your schema to your search path:

```sql
SELECT set_config('search_path', current_setting('search_path') || ',aws_01', false);
show search_path;
```

You can now run unqualified queries:

```sql
select instance_id, instance_type, instance_state, region, account_id
from aws_ec2_instance
```

The search path will persist for the duration of your database session. You can revert to the default search path if you want:

```sql
set search_path to default
```
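If you want the schema on your search path in every new session, not just the current one, you can also set it at the role or database level. A minimal sketch, assuming a role named `steampipe_user` (substitute your own role name and schema list):

```sql
ALTER ROLE steampipe_user SET search_path = "$user", public, aws_01;
```

New sessions for that role will pick up this setting automatically; the current session keeps whatever search path is already in effect.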
Refer to the [documentation](https://www.postgresql.org/docs/current/ddl-schemas.html) for more details.

---

---
title: Configure
sidebar_label: Configure
---

# Configuring Steampipe SQLite Extensions

> [!IMPORTANT]
> You must use a version of SQLite that has [extension loading](#enabling-sqlite-extension-loading) enabled!

To use the Steampipe SQLite extension, you first have to load the extension module. Run SQLite, and in the SQLite shell load the extension with the `.load` command:

```
$ sqlite3
sqlite> .load ./steampipe_sqlite_extension_github.so
```

Once the extension is loaded, the virtual tables will appear. You can run the SQLite `pragma module_list` command to see them:

```sql
sqlite> pragma module_list;
pragma_table_info
github_issue_comment
json_each
github_workflow
github_traffic_view_weekly
...
```

Now that the extension is loaded, we have to configure it with plugin-specific options. Many plugins include a default configuration that may "just work", but you can explicitly set the configuration with the `steampipe_configure_{plugin}` function:

```sql
sqlite> select steampipe_configure_github('token="ghp_Bt2iThisIsAFakeToken1234567"');
```

Each extension includes its own `steampipe_configure` function that takes as its argument a string containing the HCL configuration options for the plugin. The options vary per plugin, and match the [connection](https://steampipe.io/docs/managing/connections) options for the corresponding plugin. You can view the available options and syntax for the plugin in the [Steampipe Hub](https://hub.steampipe.io/plugins).

Note that HCL is newline-sensitive. To specify multiple arguments, you must include the line breaks:

```sql
sqlite> select steampipe_configure_aws('
  access_key="AKIA4YFAKEKEYT99999"
  secret_key="A32As+zuuBFThisIsAFakeSecretNb77HSLmcB"
');
```

## Persisting Your Configuration

SQLite does not persist your module configuration; you need to load and configure the module(s) each time you start SQLite. Fortunately, SQLite provides options for loading these commands from a file.

Create a file with the commands you wish to run when SQLite starts:

```sql
-- Turn on column headers
.headers ON

-- Set output to table
.mode table

-- Load and configure the GitHub extension
.load ./steampipe_sqlite_extension_github.so
select steampipe_configure_github('token="ghp_Bt2iThisIsAFakeToken1234567"');

-- Load and configure the AWS extension
.load ./steampipe_sqlite_extension_aws.so
select steampipe_configure_aws('
  access_key="AKIA4YFAKEKEYT99999"
  secret_key="A32As+zuuBFThisIsAFakeSecretNb77HSLmcB"
');
```

To load this *every time you start SQLite*, name the file `.sqliterc` and save it to the root of your home directory. Alternatively, you can give the file another name and then pass the file to the `--init` argument when starting SQLite:

```bash
./sqlite3 my_db --init ./init.sql
```

Or run the file after starting SQLite with the `.read` command:

```
sqlite> .read ./init.sql
```

## Enabling SQLite extension loading

The Steampipe SQLite extensions are packaged as loadable modules. You must use a version of SQLite that has extension loading enabled. Some SQLite distributions (including the version that ships with macOS) disable module loading as a compilation option, and you can't enable it. In this case, you have to install a version that supports extensions. You can download a precompiled SQLite binary for your platform [from the SQLite downloads page](https://www.sqlite.org/download.html) or use a package manager such as `brew`, `yum`, or `apt` to install it.

If you try to run the `.load` command but you get an error like `Error: unknown command or invalid arguments: "load". Enter ".help" for help`, you may not have extension loading enabled. If your installation has the `OMIT_LOAD_EXTENSION` compile option, then it does not support loadable modules:

```bash
$ sqlite3 :memory: 'select * from pragma_compile_options()' | grep OMIT_LOAD_EXTENSION
```

## Caching

By default, query results are cached for 5 minutes. You can change the duration with the [STEAMPIPE_CACHE_MAX_TTL](/docs/reference/env-vars/steampipe_cache_max_ttl) environment variable:

```bash
export STEAMPIPE_CACHE_MAX_TTL=600  # 10 minutes
```

or disable caching with the [STEAMPIPE_CACHE](/docs/reference/env-vars/steampipe_cache) environment variable:

```bash
export STEAMPIPE_CACHE=false
```

## Logging

You can set the logging level with the [STEAMPIPE_LOG_LEVEL](/docs/reference/env-vars/steampipe_log) environment variable. By default, the log level is set to `warn`.

```bash
export STEAMPIPE_LOG_LEVEL=DEBUG
```

SQLite logs are written to STDERR, and they will be printed to the console by default. You can redirect them to a file instead with the standard file redirection mechanism:

```bash
sqlite3 2> errors.log
```
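You can combine the two; for example, to capture debug-level logs for a single session in a file (a sketch assuming a POSIX-compatible shell; use any log file name you like):

```bash
STEAMPIPE_LOG_LEVEL=DEBUG sqlite3 2> steampipe_debug.log
```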
---

---
title: Install
sidebar_label: Install
---

# Installing Steampipe SQLite Extensions

Each Steampipe plugin is distributed as a distinct Steampipe SQLite Extension. They are available for download in the **Releases** for the corresponding plugin repo; however, it is simplest to install them with the [Steampipe SQLite install script](https://steampipe.io/install/sqlite.sh):

```bash
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/sqlite.sh)"
```

The installer will prompt you for the plugin name, version, and destination directory. It will then determine the OS and system architecture, and it will download and install the appropriate package.

```bash
Enter the plugin name: github
Enter version (latest):
Enter location (current directory):

Downloading steampipe_sqlite_github.darwin_arm64.tar.gz...
###################################################################################################################################################################### 100.0%
x steampipe_sqlite_github.so

steampipe_sqlite_github.darwin_arm64.tar.gz downloaded and extracted successfully at /Users/jsmyth/src/steampipe_anywhere/sqlite.
```

The installer will find the appropriate extension (packaged as a `.so` file) and download it to the current directory. If you don't want to use the installer, you can download, extract, and install the file yourself. There are downloadable `tar.gz` packages for all platforms available in the **Releases** for the corresponding plugin's Github repo (e.g. https://github.com/turbot/steampipe-plugin-aws/releases).

---

---
title: Steampipe SQLite Extensions
sidebar_label: SQLite Extensions
---

# Steampipe SQLite Extensions

Steampipe SQLite extensions provide a zero-ETL, SQLite-native query experience. The Steampipe SQLite extensions create SQLite virtual tables that translate your queries into API calls, transparently fetching information from your API or service as you request it.

---

---
title: Query
sidebar_label: Query
---

# Querying Steampipe SQLite Extensions

Your Steampipe extension adds virtual tables to your SQLite installation. Typically, these tables are prefixed with the plugin name. You can run `pragma module_list;` to get a list of virtual tables, or refer to the documentation for the plugin in the [Steampipe Hub](https://hub.steampipe.io/plugins). The Hub also contains sample queries for each table.

You can use standard SQLite syntax to query the tables:

```sql
select name, is_private, owner_login
from github_my_repository
```

It is often useful to use `limit` to discover what columns are available for a table without fetching too much data:

```sql
select * from aws_iam_access_key limit 1
```

The normal [Steampipe guidance](/docs/sql/tips) applies:
- Select only the columns that you need.
- Limit results with a `where` clause on key columns when possible.
- Be aware that some tables *require* a `where` or `join` clause (see the sketch below).
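For example, some tables only return rows when you qualify them on a key column. A minimal sketch, assuming the GitHub plugin's `github_repository` table, which expects a `full_name` qualifier (check the Hub documentation for each table's required columns):

```sql
select name, description
from github_repository
where full_name = 'turbot/steampipe';
```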
## SQLite Data Types

Unlike Postgres, SQLite does not have [native data types](https://www.sqlite.org/datatype3.html) for Date/Time, Boolean, JSON, or IP addresses, so these columns are represented as `TEXT` or `NUMBER`. While the data types are not supported as native storage types, SQLite does provide functions to manipulate these types of data.

### Boolean

Boolean values are stored as integers: `0` (false) and `1` (true):

```sql
select name, bucket_policy_is_public
from aws_s3_bucket
where bucket_policy_is_public = 1;
```

As a result, implicit boolean comparisons work as you would expect:

```sql
select name, bucket_policy_is_public
from aws_s3_bucket
where bucket_policy_is_public;
```

SQLite version 3.23.0 and later also recognize the keywords `TRUE` and `FALSE`. They are essentially just aliases for `1` and `0`:

```sql
select name, bucket_policy_is_public
from aws_s3_bucket
where bucket_policy_is_public = TRUE;
```

### Date/Time

Steampipe SQLite extensions store date/time fields as text in RFC-3339 format. You can use [SQLite date and time functions](https://www.sqlite.org/lang_datefunc.html) to work with these columns.

```sql
select
  access_key_id,
  user_name,
  status,
  create_date,
  julianday('now') - julianday(create_date) as age_in_days
from aws_iam_access_key
where age_in_days > 30;
```

### JSON

Steampipe SQLite extensions store JSON fields as JSON-formatted text. You can use [SQLite JSON functions and operators](https://www.sqlite.org/json1.html) to work with this data.

You can extract data with `json_extract`:

```sql
select
  name,
  json_extract(acl, '$.Owner') as owner
from aws_s3_bucket;
```

SQLite version 3.38.0 and later also support the `->` and `->>` operators, which are usually simpler:

```sql
select
  name,
  acl -> 'Owner' ->> 'ID' as owner
from aws_s3_bucket;
```

You can use the [json_each table-valued function](https://www.sqlite.org/json1.html#jeach) to treat JSON arrays as rows and use them to join tables:

```sql
select
  i.instance_id,
  vol.volume_id,
  vol.size
from
  aws_ebs_volume as vol,
  json_each(vol.attachments) as att
  join aws_ec2_instance as i on i.instance_id = att.value ->> 'InstanceId'
order by
  i.instance_id;
```

### INET/CIDR

Currently, SQLite does not include any functions for IP addresses or CIDR data. There are multiple 3rd party extensions you can install that provide functions for working with IP address data.
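Until you install one, plain text matching can still handle coarse filters; a minimal sketch, assuming the AWS plugin's `aws_vpc` table, where `cidr_block` is stored as text:

```sql
select vpc_id, cidr_block
from aws_vpc
where cidr_block like '10.%';
```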