New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: split cpp/overrunning-write into two
#7386
base: main
Are you sure you want to change the base?
Conversation
|
I'll now create the internal request and start a DCA experiment on this, and then maybe play around with some LGTM databases. |
3e8ac4d
to
3c01605
The code LGTM, I think (see my one comment). A few things need to happen before we can merge this:
- We need to run a DCA on the code-scanning suite to make sure the performance on that suite is good with this new query.
- We need to add a .qhelp file, which is probably just a copy/paste of the one we have for
cpp/overrunning-write. - We need to run the new query on the ~140 projects we usually test on (or even better: on all of LGTM) and make sure that the new query doesn't have too many FPs.
- Once that's done, we need to add the
ready-for-doc-reviewlabel on this PR so that the docs team can review the .qhelp file, the metadata, and stuff like that.
Great! |
125385b
to
a7ab402
7bc26a6
to
fa63b76
|
On v = &fc->fixvar[fixvar_idx++].var;
name = v->di_key;
STRCPY(name, "self");where struct dictitem_S
{
typval_T di_tv; // type and value of the variable
char_u di_flags; // DI_FLAGS_ flags (only used for variable)
char_u di_key[1]; // key (actually longer!)
};
typedef struct dictitem_S dictitem_T;and the items are Any thoughts? Is this ok, or should we try to detect such a case somehow? |
It would be great if we can filter those out. We should have code for this already. Take a look at https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/commons/Buffer.qll#L15. That predicate is actually already used to filter out some results in |
|
on |
Hmm, actually the full code snippet is // Set l:self to "selfdict". Use "name" to avoid a warning from
// some compiler that checks the destination size.
v = &fc->fixvar[fixvar_idx++].var;
name = v->di_key;
STRCPY(name, "self");Might it be that the aliasing they use to avoid compiler warnings is actually stumping our own code detecting this case? Or maybe it is because that field is not |
This indeed seems like a bug with range analysis, I've managed to reduce the false positive case to #7396 . Can you have a look @geoffw0 ? |
yes, I think that's it. As the copy is not done directly on Do you think we should enhance |
I think it's okay to accept this kind of FP for now (since it didn't stop increasing the precision of another query in #6760). But let's collect these FPs in an issue. |
| @@ -359,6 +376,17 @@ private int lengthInBase10(float f) { | |||
| result = f.log10().floor() + 1 | |||
| } | |||
|
|
|||
| private BufferWriteEstimationReason getEstimationReasonForIntegralExpression(Expr expr) { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure expr is sufficiently bound in this predicate, but let's test this with DCA before we make any changes. I'd be much less nervous about this if we added a bindingset[expr] to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a problem we might see it on performance? Should I add the tuple counting option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you can try adding tuple counting. Be sure to update DCA to get the latest updates to that feature.
9ac2177
to
cc8a646
|
I assume that it is not possible to do some sharing of the query help using query help inclusion? |
This splits the `cpp/overruning-write` into two separate queries based off on the reason for the estimation. If the overrun is detected based on non-trivial range analysis, the results are now marked by the new `cpp/very-likely-overruning-write` high precision query. If it is based on less precise, usually type based bounds, then it will still be marked by `cpp/overruning-write` which remains at medium precision.
Rather than testing for `TypeBoundsAnalysis`, we test that the reason is not `ValueFlowAnalysis` (which is reported by the new `cpp/very-likely-overruning-write` query), so that if a client has overridden `BufferWrite::getMaxData` the `NoSpecifiedEstimateReason` is taken into account.
Also update the help of `cpp/overruning-write`, as the case shown there will actually not be flagged by that query any more.
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
This also restrict what we consider "non-trivial" range analysis, as we now require both ends to be non-trivially bounded for signed integers. This avoids false positives stemming from a non trivial upper bound but no meaningful lower bound, for example.
1535efc
to
4beaf2e
I did not know about this mechanism, I'll have a look, thanks! |
There were some false positives where something like
int x;
// ...
sprintf(buff, "%ld", (long)x);
was considered as if the parameter had a non-trivial range analysis only
because the range of `int` is smaller than the range for `long`, without
any non-trivial range analysis actually done on `x`.
These will now be reported by `OverrunWrite` instead.
Also added the relevant CERT C _and_ C++ standard references where they were missing, and did some minor stylistic tweaks to `OverrunWriteFloat.qhelp`.
|
The DCA experiment failed due to a memory issue on the join-order badness analysis (that's my fault, sorry!) It's still not fixed, but you can still generate the data locally by doing: when I do that, I get the following table (I've removed a few of the columns for simplicity): I think this shows that there might be a potential performance problem that we need to look into before we merge this. |
|
Docs first responder here |
| <li>Control the size of the buffer using a preprocessor define.</li> | ||
| <li>Replace the call to <code>strcpy</code> with <code>strncpy</code>, specifying the define as the maximum length to copy. This will prevent the buffer overflow.</li> | ||
| <li>Consider increasing the buffer size, say to 20 characters, so that the message is displayed correctly.</li> | ||
| <li>Control the size of the buffer by declaring it with a compile time constant</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit to be consistent with the other list items
| <li>Control the size of the buffer by declaring it with a compile time constant</li> | |
| <li>Control the size of the buffer by declaring it with a compile time constant.</li> |
| "qhelp.dtd"> | ||
| <qhelp> | ||
| <overview> | ||
| <p>The program performs a buffer copy or write operation with no upper limit on the size of the copy, and by analysing the bounds of the expressions involved it appears that certain inputs will cause a buffer overflow to occur in this case. In addition to causing program instability, techniques exist which may allow an attacker to use this vulnerability to execute arbitrary code.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
| <p>The program performs a buffer copy or write operation with no upper limit on the size of the copy, and by analysing the bounds of the expressions involved it appears that certain inputs will cause a buffer overflow to occur in this case. In addition to causing program instability, techniques exist which may allow an attacker to use this vulnerability to execute arbitrary code.</p> | |
| <p>The program performs a buffer copy or write operation with no upper limit on the size of the copy. By analyzing the bounds of the expressions involved, it appears that certain inputs will cause a buffer overflow to occur in this case. In addition to causing program instability, techniques exist which may allow an attacker to use this vulnerability to execute arbitrary code.</p> |
|
|
||
| <p>To fix this issue these changes should be made:</p> | ||
| <ul> | ||
| <li>Control the size of the buffer by declaring it with a compile time constant</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| <li>Control the size of the buffer by declaring it with a compile time constant</li> | |
| <li>Control the size of the buffer by declaring it with a compile time constant.</li> |


This splits the
cpp/overruning-writeinto two separate queries basedoff on the reason for the estimation. If the overrun is detected based
on non-trivial range analysis, the results are now marked by the new
cpp/very-likely-overruning-writehigh precision query. If it is basedon less precise, usually type based bounds, then it will still be marked
by
cpp/overruning-writewhich remains at medium precision.Closes: github/codeql-c-team#751
The text was updated successfully, but these errors were encountered: