The Wayback Machine - http://web.archive.org/web/20201224210924/https://github.com/fishtown-analytics/dbt/issues/2828

Use default google cloud project if not supplied? #2828

Closed
max-sixty opened this issue Oct 10, 2020 · 3 comments

@max-sixty
Contributor

@max-sixty max-sixty commented Oct 10, 2020

Describe the feature

Currently using BigQuery requires defining your project in profiles.yml: https://docs.getdbt.com/reference/warehouse-profiles/bigquery-profile/

Google Cloud APIs generally fall back to the default project when one isn't specified. This is helpful for code that runs in multiple project environments — it'll reference the datasets in whatever project it's running in.

So the feature would be to align dbt with that standard, and allow for:

my-bigquery-db:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      # project: [GCP project id] <- uses the current project
      dataset: [the name of your dbt dataset] # You can also use "schema" here
      threads: [1 or more]
      timeout_seconds: 300
      location: US # Optional, one of US or EU
      priority: interactive
      retries: 1

Describe alternatives you've considered

Currently we have something like:

nimbus:
  target: main
  outputs:
    main:
      type: bigquery
      method: oauth
      project: "{{ env_var('PROJECT', 'project_foo') }}"

...and set $PROJECT to the output of gcloud config get-value project. This works, but it adds some cruft.

(And if I'm missing something and there's an easy solution to this, that would be gratefully received!)

Who will this benefit?

BigQuery users, particularly those running across dev and prod environments

Are you interested in contributing this feature?

Not right now, given my other OSS work, but would be keen to contribute to dbt at some point!

Thank you!

@jtcohen6
Contributor

@jtcohen6 jtcohen6 commented Oct 12, 2020

Thanks for the detailed proposal, @max-sixty. I think it makes sense to fall back to the default project configured by the gcloud user / service account if it's not specified in profiles.yml.

This isn't a change we would prioritize, and I'm glad to see you have a workaround in the meantime. I imagine it could be quite straightforward. I'll mark this as a good first issue, for whenever you (or another community member) have the time.

@max-sixty
Contributor Author

@max-sixty max-sixty commented Nov 22, 2020

I'm happy to have a look into this — any ideas on where to start? I'm not familiar with the code base.

  • Should the default be set when profiles.yml is parsed, or when it's used? Probably when it's used, with the result cached, which would reduce the performance cost of shelling out; but maybe it's simplest to do at the parsing stage?
  • Where is profiles.yml parsed? What's the python object that holds the results?
  • dbt already pulls the project here — could we use that?
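
The "resolve when used, then cache" option from the first bullet could look roughly like the sketch below. It is only an illustration, not dbt code: the function name gcloud_configured_project is hypothetical, and it assumes the gcloud CLI is on the PATH and that gcloud config get-value project prints the configured project (the "(unset)" check is a guess at how some gcloud versions report a missing value).

```python
import subprocess
from functools import lru_cache
from typing import Optional


@lru_cache(maxsize=1)
def gcloud_configured_project() -> Optional[str]:
    """Hypothetical helper: shell out to gcloud once, then reuse the answer."""
    try:
        result = subprocess.run(
            ["gcloud", "config", "get-value", "project"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (FileNotFoundError, subprocess.CalledProcessError,
            subprocess.TimeoutExpired):
        # gcloud is missing, exited with an error, or took too long.
        return None
    project = result.stdout.strip()
    # Assumption: an unset value shows up as empty output or "(unset)".
    if not project or project == "(unset)":
        return None
    return project
```

Because of lru_cache, only the first call pays for the subprocess; later calls return the remembered value, including a remembered None.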

@jtcohen6
Contributor

@jtcohen6 jtcohen6 commented Nov 23, 2020

You're looking in the right place: /plugins/bigquery/dbt/adapters/bigquery/connections.py.

Here is where dbt parses the database (with project as alias) out of the profile:

def _connection_keys(self):
    return ('method', 'database', 'schema', 'location', 'priority',
            'timeout_seconds', 'maximum_bytes_billed')

And here is where dbt uses that database value to generate a connection:

database = profile_credentials.database

In between those two is the code you linked to. I think it'd be a fairly straightforward change to add some logic that checks whether database is None and, if so, sets it to the user's default project_id.
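
A minimal sketch of that check, not dbt's actual implementation: resolve_database and adc_default_project are hypothetical names, and the sketch assumes the google-auth package, whose google.auth.default() returns a (credentials, project_id) pair in which project_id may be None.

```python
from typing import Callable, Optional


def adc_default_project() -> Optional[str]:
    """Hypothetical helper: project from Application Default Credentials."""
    # Imported lazily so the fallback logic below can be exercised
    # without google-auth installed.
    import google.auth

    _credentials, project_id = google.auth.default()
    return project_id


def resolve_database(
    database: Optional[str],
    default_project: Callable[[], Optional[str]] = adc_default_project,
) -> str:
    """Prefer the value from profiles.yml; otherwise use the default project."""
    if database is not None:
        return database
    project_id = default_project()
    if project_id is None:
        raise ValueError(
            "No project set in profiles.yml and no default project found"
        )
    return project_id
```

For example, resolve_database("my-project") returns "my-project" without consulting the environment, while resolve_database(None) falls through to the default-project lookup. Passing the lookup in as a parameter keeps the fallback testable with a stub.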

@max-sixty max-sixty mentioned this issue Nov 24, 2020
