The Wayback Machine - http://web.archive.org/web/20210816184601/https://github.com/aws-amplify/amplify-js/issues/8594
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Google Sign In with customState works on local dev/hosted staging, but not on production #8594

Open
3 tasks done
talyh opened this issue Jul 16, 2021 · 7 comments
Open
3 tasks done

Comments

@talyh
Copy link
Task lists! Give feedback

@talyh talyh commented Jul 16, 2021

Before opening, please confirm:

JavaScript Framework

React

Amplify APIs

Authentication

Amplify Categories

auth

Environment information

# Put output below this line
System:
    OS: macOS 11.4
    CPU: (12) x64 Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
    Memory: 119.77 MB / 16.00 GB
    Shell: 5.8 - /bin/zsh
  Binaries:
    Node: 12.16.1 - /usr/local/bin/node
    Yarn: 1.22.4 - /usr/local/bin/yarn
    npm: 6.13.4 - /usr/local/bin/npm
  Browsers:
    Chrome: 91.0.4472.114
    Firefox: 89.0.2
    Safari: 14.1.1
  npmPackages:
    @ampproject/toolbox-optimizer:  undefined ()
    @aws-amplify/auth: ^4.1.1 => 4.1.1
    @aws-amplify/core: ^4.1.3 => 4.1.3
    @babel/core:  undefined ()
    @googlemaps/js-api-loader: ^1.2.0 => 1.2.0
    @goproperly/eslint-config-properly-react: ^2.5.0 => 2.5.0
    @goproperly/linoleum: ^12.10.1 => 12.10.1
    @goproperly/web-util: 7.0.0 => 7.0.0
    @next/bundle-analyzer: ^10.0.3 => 10.0.3
    @reach/auto-id: ^0.12.1 => 0.12.1 (0.11.2, 0.15.0)
    @reach/disclosure: ^0.12.1 => 0.12.1
    @reach/menu-button: ^0.12.1 => 0.12.1
    @reach/tabs: ^0.15.0 => 0.15.0
    @seznam/compose-react-refs: ^1.0.4 => 1.0.4
    @testing-library/jest-dom: ^5.11.2 => 5.11.2
    @testing-library/react: ^10.4.7 => 10.4.7
    @testing-library/react-hooks: ^5.0.3 => 5.0.3
    algoliasearch: ^4.3.0 => 4.3.0
    amphtml-validator:  undefined ()
    apexcharts: ^3.20.2 => 3.20.2
    arg:  undefined ()
    async-retry:  undefined ()
    async-sema:  undefined ()
    babel-jest: ^26.2.2 => 26.2.2 (26.6.3)
    bfj:  undefined ()
    bootstrap: ^4.1.3 => 4.1.3
    buttercms: ^1.2.7 => 1.2.7
    cacache:  undefined ()
    cache-loader:  undefined ()
    ci-info:  undefined ()
    comment-json:  undefined ()
    compression:  undefined ()
    conf:  undefined ()
    content-type:  undefined ()
    cookie:  undefined ()
    css-loader:  undefined ()
    debug:  undefined ()
    deep-object-diff: ^1.1.0 => 1.1.0
    devalue:  undefined ()
    dot-prop-immutable: ^2.1.0 => 2.1.0
    escape-string-regexp:  undefined ()
    eslint: ^7.6.0 => 7.6.0
    fast-deep-equal: ^3.1.3 => 3.1.3
    file-loader:  undefined ()
    find-cache-dir:  undefined ()
    find-up:  undefined ()
    fresh:  undefined ()
    google-map-react: ^2.1.7 => 2.1.7
    gzip-size:  undefined ()
    h3-js: ^3.7.1 => 3.7.1
    html-entities: ^1.3.1 => 1.3.1
    http-proxy:  undefined ()
    husky: ^4.2.5 => 4.2.5
    ignore-loader:  undefined ()
    imagemin-mozjpeg: ^9.0.0 => 9.0.0
    imagemin-pngquant: ^9.0.0 => 9.0.0
    imagemin-svgo: ^8.0.0 => 8.0.0
    intl: ^1.2.5 => 1.2.5
    is-animated:  undefined ()
    is-docker:  undefined ()
    is-wsl:  undefined ()
    jest: ^26.6.3 => 26.6.3
    jest-styled-components: ^7.0.2 => 7.0.2
    jquery: ^3.5.1 => 3.5.1
    js-cookie: ^2.2.1 => 2.2.1
    json5:  undefined ()
    jsonwebtoken:  undefined ()
    lint-staged: ^10.2.11 => 10.2.11
    lity: ^2.4.1 => 2.4.1
    loader-utils:  undefined ()
    lodash.curry:  undefined ()
    lru-cache:  undefined ()
    moment: ^2.29.1 => 2.29.1
    nanoid:  undefined ()
    neo-async:  undefined ()
    next: ^10.2.0 => 10.2.0
    next-build-id: ^3.0.0 => 3.0.0
    next-compose-plugins: ^2.2.0 => 2.2.0
    next-optimized-images: ^2.6.2 => 2.6.2
    nextjs-sitemap-generator: ^1.3.1 => 1.3.1
    node-fetch: ^2.6.1 => 2.6.1
    ora:  undefined ()
    patch-package: ^6.4.7 => 6.4.7
    popper.js: ^1.16.1 => 1.16.1
    postcss-flexbugs-fixes:  undefined ()
    postcss-loader:  undefined ()
    postcss-preset-env:  undefined ()
    postcss-scss:  undefined ()
    prettier: ^2.2.1 => 2.2.1
    promise-memoize: ^1.2.1 => 1.2.1
    prop-types: ^15.7.2 => 15.7.2
    react: ^17.0.1 => 17.0.1
    react-apexcharts: ^1.3.7 => 1.3.7
    react-calendly: ^2.0.4 => 2.0.4
    react-compound-slider: ^2.5.0 => 2.5.0
    react-dom: ^17.0.1 => 17.0.1
    react-html-parser: ^2.0.2 => 2.0.2
    react-html-parser-demo:  0.0.0
    react-pose: ^4.0.10 => 4.0.10
    react-redux: ^7.2.0 => 7.2.0
    react-share: ^4.3.1 => 4.3.1
    react-slick: ^0.26.1 => 0.26.1
    react-stack-grid: ^0.7.1 => 0.7.1
    react-sticky-box: ^0.9.3 => 0.9.3
    react-test-renderer: ^16.13.1 => 16.13.1
    react-timeago: ^4.4.0 => 4.4.0
    recast:  undefined ()
    redux: ^4.1.0 => 4.1.0
    redux-actions: ^2.6.5 => 2.6.5
    redux-dynamic-modules: ^5.2.3 => 5.2.3
    redux-dynamic-modules-saga: ^5.2.3 => 5.2.3
    redux-saga: ^1.1.3 => 1.1.3
    redux-saga-test-plan: ^4.0.1 => 4.0.1
    redux-saga/effects:  undefined ()
    resolve-url-loader:  undefined ()
    rimraf: ^3.0.2 => 3.0.2 (2.6.3, 2.7.1)
    rollbar-sourcemap-webpack-plugin: ^3.2.0 => 3.2.0
    sass: ^1.35.1 => 1.35.1
    sass-loader:  undefined ()
    schema-utils:  undefined ()
    selfsigned: ^1.10.7 => 1.10.7
    semver:  undefined ()
    send:  undefined ()
    slick-carousel: ^1.8.1 => 1.8.1
    source-map:  undefined ()
    string-hash:  undefined ()
    strip-ansi:  undefined ()
    styled-components: ^5.1.1 => 5.1.1
    styled-components/macro:  undefined ()
    styled-components/native:  undefined ()
    styled-components/primitives:  undefined ()
    supercluster: ^7.1.0 => 7.1.0
    terser:  undefined ()
    text-table:  undefined ()
    thread-loader:  undefined ()
    twilio: ^3.65.0 => 3.65.0
    unistore:  undefined ()
    use-deep-compare-effect: ^1.4.0 => 1.4.0
    use-media: ^1.4.0 => 1.4.0
    use-resize-observer: ^7.0.0 => 7.0.0
    use-timeout: ^1.1.0 => 1.1.0
    uuid: ^8.3.2 => 8.3.2 (3.4.0, 7.0.3)
    victory: ^33.1.7 => 33.1.7
    web-vitals:  undefined ()
    webpack:  undefined ()
    webpack-sources:  undefined ()
  npmGlobalPackages:
    commitizen: 4.2.2
    jest: 26.6.3
    npm: 6.13.4
    typescript: 4.0.5

Describe the bug

TLDR: Google Sign In with customState works on local dev/hosted staging, but not on production.

CONTEXT

We run a NextJS react app, deployed within Vercel.
This app has an Amplify config file that maps to either a Prod or Staging Cognito user pool, based on an env variable that determines whether it's running prod or staging.

In both cases, that setup looks like this, with the XXXX just representing the appropriate values for each Cognito pool.

Auth: {
      region: 'us-east-1',
      userPoolId: 'us-east-XXXX',
      userPoolWebClientId: 'XXXX',
      oauth: {
        domain: 'xxxx',
        scope: ['email', 'openid', 'profile'],
        redirectSignIn: 'XXXX/validation/',
        redirectSignOut: 'XXXX/validation/',
        clientId: 'XXXX',
        responseType: 'code',
      },
    }

On their setup, these Cognito pools are the same (same policies, same attributes, etc), only with different domains under Domain Names. They also have equivalent apps setup, with OAuth configured the same on both sides.
Both the Prod and Staging Cognito pools have Identity Provider configuration pointing to the same Google Cloud Project, which is configured to accept request from both sources.
Both are associated to the same organization and region within AWS.


On app initialization, we start listening for auth events. That looks like:

useEffect(() => {
    // We want to initialize amplify before setting auth state to avoid race condition between them
    initializeAmplify();
    dispatch(setAmplifyInitialize(true));

    Hub.listen('auth', ({ payload: { event, data } }) => {
      switch (event) {
        //
        //  .... sign in & sign out events here
        //

        case 'customOAuthState':
          dispatch(loggedInRedirect({ customState: data }));
          break;
        default:
          break;
      }
    });
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, []);

In this context, `dispatch` is dispatching to our redux store, so the rest of the app can react to it.

Before triggering the sign in, we generate some relevant state, attached it to an encoded string, save it locally. We then send the encoded string as customState.

That looks like

    await Auth.federatedSignIn({
      provider: 'Google',
      customState: encodedState,
    });

The /validation page (set in the redirectSignIn) has logic to compare that locally saved state with what's in redux.
If they match, we redirect the user to the page they were at when sign in was initialized.

HOW THE ISSUE MANIFESTS

The flow above works well on local dev and hosted staging environments. We see the customState event firing and the corresponding state being set in redux.
In production, however, we don't see the event firing and no redux data.
There are no console or network errors and the user token is present. With the exception of the token, everything else seems to work.

We initially suspected a race condition happening in prod that didn't happen on staging, but after some experiments, discarded this theory.

We then used patch-package to add some logging to @amplify/auth.
With this logging, we were able to see
image
--> It's worth calling out that this silent errorring consumed many hours of debugging, so a minor feature request inside this larger bug report would be for the error to be more explicit to avoid wild goose chases.

The error above wasn't sufficient to indicate to us why this would yield different behaviours across different environments.

The querystring we receive in return has code and state in both cases.
The value of state in the QS seems to be encoded by Amplify, so hard to say what it actually contains in each case, but it is present at all times.

Expected behavior

  1. Given equivalent Cognito configurations, Amplify should behave the same
  2. Consistent behaviour across environments in regards to triggering customOAuthState or failing it
  3. Better communication of error in case of failure

Reproduction steps

The bug description has all the details I can provide, but I can't offer repro steps as I only have our own Cognito pools to test with.

Code Snippet

// Put your code below this line.

Log output

// Put your logs below this line


aws-exports.js

No response

Manual configuration

Auth: {
      region: 'us-east-1',
      userPoolId: 'us-east-XXXX',
      userPoolWebClientId: 'XXXX',
      oauth: {
        domain: 'xxxx',
        scope: ['email', 'openid', 'profile'],
        redirectSignIn: 'XXXX/validation/',
        redirectSignOut: 'XXXX/validation/',
        clientId: 'XXXX',
        responseType: 'code',
      },
    }

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

  • This issue was 100% consistent in its behaviour. Everyone in our team experienced the same result both in terms of what was working on staging and not working on production, discarding issues of local environments or uncommon conditions in the Google account.

  • We had a test version deployed to production which skipped the comparison and just executed the gated behaviour regardless. That version works, indicating the actual token validation / general Cognito/Amplify setup is ok. The only manifested issue is really that we don't get the customState event/data in production.

  • Digging into the actual data returned by the different Cognito pools / Amplify token exchange, the only significant difference we could see is that production returns a idToken.payload.nonce, which we don't see in staging. To the best of my understanding, this is not related to any configuration of ours, but rather something Cognito is deciding to do.

  • When we started dealing with this issue we were using
  "@aws-amplify/core": "^3.8.2",
  "@aws-amplify/auth": "^3.4.10",

We since updated to

  "@aws-amplify/auth": "^4.1.1",
  "@aws-amplify/core": "^4.1.3",

This didn't help matters


  • We initially suspected a race condition, so we experimented a lot with moving the Amplify initialization / event listener subscription. None of those experiments yielded a change in behaviour.

  • As with most modern apps, there's a lot that runs on our production environment. We've done lots of experiments disabling analytics and other similar things to try and remove variables, but no matter what, we were always seeing the same result.
@chrisbonifacio chrisbonifacio added this to Needs Triage in Issues Triaging via automation Jul 16, 2021
@chrisbonifacio chrisbonifacio self-assigned this Jul 21, 2021
@chrisbonifacio chrisbonifacio moved this from Needs Triage to To Be Reproduced in Issues Triaging Jul 21, 2021
@chrisbonifacio chrisbonifacio removed this from To Do in MLH Summer 2021 Jul 22, 2021
@talyh
Copy link
Author

@talyh talyh commented Jul 26, 2021

I noticed there's been some movement in the issue in terms of assignments, but no additional info.
Are you able to provide an update on this?

Thanks,

@evcodes
Copy link
Contributor

@evcodes evcodes commented Jul 28, 2021

Hello @talyh, I am working on reproducing this issue and will get back to you with an update before Friday.

Cheers,

Eddy

@Khairo-kh
Copy link

@Khairo-kh Khairo-kh commented Jul 29, 2021

Hello @talyh, Thank you for the detailed explanation. I'm trying to confirm if this is related to another open issue here. Can you try adding trailingSlash: true, in the next.config.js file and see if that has any effect on the issue described here?

Thanks!

@talyh
Copy link
Author

@talyh talyh commented Jul 30, 2021

@Khairo-kh - I just looked at our next.config.js and it already has trailingSlash: true.

@talyh
Copy link
Author

@talyh talyh commented Aug 3, 2021

@Khairo-kh @evcodes - Wondering if you have an update on this? Anything I can help with?

@Khairo-kh
Copy link

@Khairo-kh Khairo-kh commented Aug 5, 2021

Hey @talyh - Sorry for the delay! unfortunately, I have not had any success reproducing this issue. I can successfully get the custom state data (and sign-in) both in the local dev environment and with a production build deployed to Vercel. Please let me know if you would like to see the setup I have, I can add you to the test repo I used to try and reproduce this.

On a side note, you can enable Amplify logging without using a third-party package. Simply add window.LOG_LEVEL = 'DEBUG'; in your code and you will see the logs in the browser console. See this for reference.

@talyh
Copy link
Author

@talyh talyh commented Aug 6, 2021

Hi @Khairo-kh
Thanks for looking into the setup. As far as I can tell, there's nothing in our configuration (neither in Vercel nor in AWS) that'd lead to prod behave differently (though there are so many moving pieces that it's hard to say for sure). I did open a ticket for our AWS support and they confirm that both the staging and prod pools are identical in their configuration (sure, different app ids, different domains, etc, but the same setup). So I'm not sure where else to look in terms of configuration differences.

Maybe a different approach is:
Can you help me understand what are the conditions that'd trigger these errors we saw on the production environment?
image

(Also, thanks for pointing me to the logger! I missed that and indeed it'd have been really helpful during that investigation)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Issues Triaging
To Be Reproduced
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants