JS/Python/Ruby: Document how API graphs should be interpreted #8606

asgerf · 2022-03-30T11:20:19Z

I've written what I find to be a useful interpretation of API graphs.

I'm hoping we can use this PR to discuss and align on what is the "correct" way to interpret API graphs. Much of the text is language-agnostic so I'm hoping to port it to Python and Ruby as well, after we've iterated on it a bit.

I should mention that the example with the getter at the end doesn't work in the current JS implementation. Support for getters is in flight here; that should probably be merged first, so as not to make incorrect claims in the documentation.

I haven't renamed anything in this PR. Let's settle on the text before going through with renaming.

yoff

Excellent description!

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

yoff

LGTM

calumgrant

Nice to get more clarity on this! I'll let an engineer give a final approval.

calumgrant · 2022-03-30T14:00:20Z

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

+   *
+   * Because the implementation of the external library is not visible, it is not known exactly what operations
+   * it will perform on values that flow there. Instead, the edges starting from a def-node are operations that would
+   * lead to an observable effect within the current codebase; without knowing for certain if the library will actually perform


I don't understand "observable effect within the current codebase". Could you mean "represent a potential data flow"?

asgerf · 2022-04-04

Unfortunately I don't understand what "represent a potential data flow" is supposed to mean, and I can't find a different way to express it.

Was the intuition not clear from the example below?

yoff · 2022-04-05

I guess, in terms of dataflow, observable effects are just values flowing into the codebase from the library. However, from a security point of view, effects could be other things: they called my function, so I know we just sent an email.

calumgrant · 2022-03-31T08:26:26Z

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

+   * A callback is passed to the external function `foo`. We can't know if `foo` will actually invoke this callback.
+   * But _if_ the library should decide to invoke the callback, then a value will flow into the current codebase via the `x` parameter.
+   * For that reason, an edge is generated representing the argument-passing operation that might be performed by `foo`.
+   * This edge is going from the def-node associated with the callback to the use-node associated with the parameter `x`.


Maybe clarify with "of the lambda" ?

asgerf · 2022-04-04

We generally don't use the word "lambda" when talking about functions in the JS documentation. But there's only one parameter in the example, so I'm not sure why it needs clarification.
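To make the quoted point concrete, here is a minimal sketch. The two `foo` variants are hypothetical stand-ins for an external library function and do not come from the PR. The call sites are identical, so an analysis that only sees the caller has to assume that the callback might be invoked and that `x` might receive a value chosen by the library.

```javascript
// Hypothetical stand-ins for an external library function `foo`.
// From the caller's side, the two are indistinguishable.
function fooThatCalls(callback) {
  callback("value chosen by the library");
}
function fooThatIgnores(callback) {
  // Never invokes the callback.
}

// Record every value that flows into the codebase via parameter `x`.
function runWith(foo) {
  const seen = [];
  foo((x) => {
    seen.push(x);
  });
  return seen;
}

const seenWhenCalled = runWith(fooThatCalls);
const seenWhenIgnored = runWith(fooThatIgnores);
console.log(seenWhenCalled, seenWhenIgnored);
```

Only the first variant actually delivers a value to `x`, but nothing at the call site distinguishes the two, which is why the edge is generated in both cases.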

calumgrant · 2022-03-31T08:31:52Z

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

+   * on the client side is actually the return-value of the getter.
+   *
+   * Although one may think of API graphs as a tool to find certain program elements in the codebase,
+   * it can lead to some situations where intuition does not match what works best in practice.


This last sentence might be worth clarifying. Is there a specific gotcha you have in mind?

Suggested change

* it can lead to some situations where intuition does not match what works best in practice.

* it can lead to situations, as in the above example, where intuition does not match what works best in practice.

☝️ Does that work better?
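For readers less familiar with accessors, the underlying JavaScript behaviour can be sketched as follows. The object and property names are made up for illustration; the only point is that a property read on an object with a getter yields the getter's return value, so that return value is what the reading side observes.

```javascript
// Hypothetical object exposing a property through a getter.
let getterCalls = 0;
const lib = {
  get version() {
    getterCalls += 1;
    return "1.2.3"; // this return value is what a property read yields
  },
};

// The reading side writes an ordinary property access...
const observed = lib.version;

// ...but what it receives is the getter's return value.
console.log(observed, getterCalls);
```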

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

asgerf · 2022-04-05T08:52:30Z

I've followed up with renaming of member predicates as discussed internally:

getAnImmediateUse -> getASource
getAUse -> getAValueReachableFromSource
getARhs -> getASink
getAValueReachingRhs -> getAValueReachingSink

One desirable aspect of this is that it's always clear which predicate is used to define a source and which is used to define a sink:

override predicate isSink(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getASink()
}

Alternatives

Some alternatives have been brought up, so I'll go over what they look like:

getAnInput / getAnOutput
getAnOrigin / getADestination

Perhaps this is just me, but I find it ambiguous whether each of these represents an input to the current codebase or an input to the function/parameter being modelled. The same is sort of true for getASource and getASink, but at least these correspond closely with other uses of source/sink.

isSink

Looking at what an isSink predicate would look like, I think getARhs = getAnInput = getADestination is the least confusing solution:

override predicate isSink(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getARhs() // old naming
}

override predicate isSink(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getASink()
}

override predicate isSink(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getAnInput()
}

override predicate isSink(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getADestination()
}

isSource

Looking at some examples of isSource:

override predicate isSource(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getParameter(0).getAnImmediateUse() // old naming
  or
  node = API::moduleImport("foo").getReturn().getAnImmediateUse()
}

override predicate isSource(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getParameter(0).getASource()
  or
  node = API::moduleImport("foo").getReturn().getASource()
}

override predicate isSource(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getParameter(0).getAnOutput()
  or
  node = API::moduleImport("foo").getReturn().getAnOutput()
}

override predicate isSource(DataFlow::Node node) {
  node = API::moduleImport("foo").getParameter(0).getParameter(0).getAnOrigin()
  or
  node = API::moduleImport("foo").getReturn().getAnOrigin()
}

Matched code

Lastly, looking at some code snippets matched by this:

const foo = require('foo')

foo(); // API::moduleImport("foo").getReturn().getAnImmediateUse()
foo(); // API::moduleImport("foo").getReturn().getASource()
foo(); // API::moduleImport("foo").getReturn().getAnOutput()
foo(); // API::moduleImport("foo").getReturn().getAnOrigin()

foo(x); // x = API::moduleImport("foo").getParameter(0).getARhs()
foo(x); // x = API::moduleImport("foo").getParameter(0).getASink()
foo(x); // x = API::moduleImport("foo").getParameter(0).getAnInput()
foo(x); // x = API::moduleImport("foo").getParameter(0).getADestination()
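The snippets above do not cover the nested `getParameter(0).getParameter(0)` case from the `isSource` examples. A rough sketch of the code it is meant to match follows, using the getASource/getASink naming. The local `foo` is a hypothetical stub standing in for `require('foo')` so that the snippet runs on its own; with the stub in place, an API graph would of course not treat `foo` as an external library.

```javascript
// Hypothetical stub standing in for `const foo = require('foo')`.
function foo(callback) {
  if (typeof callback === "function") callback("from library");
  return "returned by library";
}

const fromCallback = [];

// `y` is the first parameter of the first argument:
//   y = API::moduleImport("foo").getParameter(0).getParameter(0).getASource()
// The function expression itself is the value passed to `foo`:
//   API::moduleImport("foo").getParameter(0).getASink()
foo((y) => {
  fromCallback.push(y);
});

// The call result:
//   API::moduleImport("foo").getReturn().getASource()
const result = foo();
console.log(fromCallback, result);
```

Both `y` and `result` are places where a library-controlled value enters the codebase, which matches their use in `isSource`.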

Conclusion (or lack thereof)

Looking at the above, I find it hard to point out a clear winner here. I think source/sink has a lower chance of getting mixed up, but feel free to discuss. Looking at the commit history may also give an indication of whether the source/sink naming "feels right" at the use site.

...rimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointFeatures.qll

erik-krogh

LGTM 👍

A very optional rewriting of getAValueReachableFromSource.

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

asgerf · 2022-05-18T13:13:58Z

Rebased to resolve conflicts, and fixed a comment.

hvitved · 2022-05-18T13:55:37Z

javascript/ql/lib/semmle/javascript/ApiGraphs.qll

+   * 3. Map the resulting API graph nodes to data-flow nodes, using `getASource` or `getASink`.
+   *
+   * For example, a simplified way to get arguments to `underscore.extend` would be
+   * ```codeql


...rimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/EndpointFeatures.qll

erik-krogh · 2022-05-23T16:50:40Z

You're still missing some renamings in EndpointFeatures.qll.
(That is what QL-for-QL is complaining about).

Co-authored-by: yoff <lerchedahl@gmail.com>

Co-authored-by: Calum Grant <42069085+calumgrant@users.noreply.github.com>

Co-authored-by: Nick Rolfe <nickrolfe@github.com>

Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>

asgerf added JS Python Ruby labels Mar 30, 2022

asgerf changed the title ~~JS: Document how API graphs should be interpreted~~ JS/Python/Ruby: Document how API graphs should be interpreted Mar 30, 2022

asgerf marked this pull request as ready for review March 30, 2022 11:54

asgerf requested a review from a team as a code owner March 30, 2022 11:54

asgerf added the WIP This is a work-in-progress, do not merge yet! label Mar 30, 2022

yoff requested changes Mar 30, 2022

asgerf added the no-change-note-required This PR does not need a change note label Mar 30, 2022

calumgrant reviewed Mar 30, 2022

erik-krogh reviewed Mar 30, 2022

yoff previously approved these changes Mar 30, 2022

calumgrant reviewed Mar 31, 2022

nickrolfe reviewed Mar 31, 2022

asgerf dismissed yoff’s stale review via 162faab April 4, 2022 10:50

asgerf requested a review from a team April 5, 2022 07:48

annarailton previously approved these changes Apr 5, 2022

erik-krogh previously approved these changes Apr 5, 2022

asgerf dismissed stale reviews from erik-krogh and annarailton via 42f7c6a April 7, 2022 08:56

erik-krogh previously approved these changes Apr 7, 2022

asgerf dismissed erik-krogh’s stale review via 33d18b4 May 18, 2022 13:13

asgerf force-pushed the js/api-graph-api branch from 42f7c6a to 33d18b4 Compare May 18, 2022 13:13

github-actions bot removed Ruby Python labels May 18, 2022

asgerf removed the WIP This is a work-in-progress, do not merge yet! label May 18, 2022

erik-krogh self-assigned this May 18, 2022

hvitved reviewed May 18, 2022

asgerf dismissed erik-krogh’s stale review via 4b346ef May 19, 2022 06:47

erik-krogh previously approved these changes May 19, 2022

asgerf mentioned this pull request May 20, 2022

JS: API graph support for accessors (and classes members) #9234

Merged

asgerf dismissed erik-krogh’s stale review via 814676f May 23, 2022 13:55

github-advanced-security bot found potential problems May 23, 2022

asgerf and others added 19 commits May 24, 2022 11:57

JS: Document how API graphs should be interpreted

6a12864

Mention that the interaction and be with any external codebase

82c35e6

Update javascript/ql/lib/semmle/javascript/ApiGraphs.qll

73baa49

Co-authored-by: yoff <lerchedahl@gmail.com>

Update javascript/ql/lib/semmle/javascript/ApiGraphs.qll

a7b73f4

Co-authored-by: Calum Grant <42069085+calumgrant@users.noreply.github.com>

JS: Rename getAnImmediateUse -> getASource

4c61926

JS: Rename getARhs -> getASink

19a5db9

JS: Also rename predicates on API::EntryPoint

ce9c3b3

JS: Make API::EntryPoint overrides optional

76ba782

JS: Autoformat

9fad4b8

Apply suggestions from code review

1ae97d9

Co-authored-by: Nick Rolfe <nickrolfe@github.com>

JS: Update doc comment

8da96ed

JS: Update ATM code

e2858b7

JS: Fix up qldoc for getAValueReachingSink

777d344

JS: Fix typo

1e96b1e

Update javascript/ql/lib/semmle/javascript/ApiGraphs.qll

18dc394

Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>

JS: Update a comment mentioning getARhs

f80f8b6

JS: Use 'ql' language for markdown snippets

bc60126

JS: Rename Node.{getASource -> asSource, getASink -> asSink}

631527f

JS: Update ATM code

87cbf7b

asgerf force-pushed the js/api-graph-api branch from fe78cf7 to 87cbf7b Compare May 24, 2022 10:00

erik-krogh approved these changes May 24, 2022

asgerf merged commit cc42f2f into github:main May 30, 2022

This was referenced May 30, 2022

Ruby: API graph renaming an documentation #9364

Merged

Python: API graph renaming and documentation #9369

Merged

7 participants
