Escape sequences in C







In the C programming language, an escape sequence is specially delimited text in a character or string literal that represents one or more other characters to the compiler. It allows a programmer to specify characters that are otherwise difficult or impossible to specify in a literal.

An escape sequence starts with a backslash (\) called the escape character and subsequent characters define the meaning of the escape sequence. For example, \n denotes a newline character.

The same or similar escape sequences are used in other, related languages such as C++, C#, Java and PHP.

Value


To demonstrate the value of the escape sequence feature, consider code that outputs the text Foo on one line and Bar on the next line. The code must output a newline between the two words.

The following code achieves the goal via text formatting and a hard-coded ASCII character value for newline (0x0A). This behaves as desired with the words on sequential lines, but an escape sequence has advantages.

#include <stdio.h>
int main() {
    printf("Foo%cBar", 0x0A);
    return 0;
}

The \n escape sequence allows for shorter code by specifying the newline in the string literal, and it avoids the run-time work of the %c conversion. Also, the compiler can map the escape sequence to the newline value of a character encoding other than ASCII, which makes the code more portable.

#include <stdio.h>
int main() {
    printf("Foo\nBar");
    return 0;
}

How it works


An escape sequence changes how the compiler interprets character data in a literal. For example, \n does not represent a backslash followed by the letter n. The backslash escapes the compiler's normal, literal way of interpreting character data. After a backslash, the compiler expects subsequent characters to complete one of the defined escape sequences, and then translates the escape sequence into the characters it represents.

This syntax requires special handling to encode a literal backslash, since a single backslash is a metacharacter that changes how the literal is interpreted rather than a backslash character itself. The issue is solved by using two backslashes (\\) to mean one.

Escape sequences


The following table includes escape sequences defined in standard C as well as some non-standard sequences. The C standard requires an escape sequence that does not match a defined sequence to be diagnosed, i.e., the compiler must issue a diagnostic message such as a warning or an error. Regardless, some compilers define additional escape sequences.

The table shows the ASCII value that each sequence maps to. A sequence may map to a different value in another encoding.

| Escape sequence | Hex value in ASCII | Character represented |
| --- | --- | --- |
| \a | 07 | Alert (Beep, Bell) (added in C89)[1] |
| \b | 08 | Backspace |
| \e | 1B | Escape character (non-standard; see Escape below) |
| \f | 0C | Form feed (page break) |
| \n | 0A | Newline (Line Feed); see Newline below |
| \r | 0D | Carriage Return |
| \t | 09 | Horizontal Tab |
| \v | 0B | Vertical Tab |
| \\ | 5C | Backslash |
| \' | 27 | Apostrophe or single quotation mark |
| \" | 22 | Double quotation mark |
| \? | 3F | Question mark (used to avoid trigraphs) |
| \nnn | nnn (octal) | The byte whose numerical value is given by nnn interpreted as an octal number; see Octal below |
| \xhh… | hh… | The byte whose numerical value is given by hh… interpreted as a hexadecimal number; see Hex below |
| \uhhhh | non-ASCII | Unicode code point below 10000 hexadecimal (added in C99)[1]: 26; see Universal character names below |
| \Uhhhhhhhh | non-ASCII | Unicode code point where h is a hexadecimal digit; see Universal character names below |

Escape


The non-standard sequence \e represents the escape character in GCC,[2] clang and tcc. It was not added to the C standard because it has no meaningful equivalent in some character sets (such as EBCDIC).[1]

Newline


The sequence \n maps to one byte, even though the platform may use more than one byte to denote a newline, such as the DOS/Windows CRLF sequence, 0x0D 0x0A. The translation from 0x0A to 0x0D 0x0A on DOS and Windows occurs when the byte is written out to a file opened in text mode or to the console, and the inverse translation is done when text files are read.

Hex


A hex escape sequence must have at least one hex digit following \x, with no upper bound; it continues for as many hex digits as there are. Thus, for example, \xABCDEFG denotes the byte with the numerical value 0xABCDEF, followed by the letter G, which is not a hex digit. However, if the resulting integer value is too large to fit in a single byte, the actual numerical value assigned is implementation-defined. Most platforms have 8-bit char types, which limits a useful hex escape sequence to two hex digits. However, hex escape sequences longer than two hex digits might be useful inside a wide character or wide string literal (prefixed with L):

// single char with value 0x12 (18 decimal)
char s1[] = "\x12";
// single char with implementation-defined value, unless char is long enough
char s2[] = "\x1234";
// single wchar_t with value 0x1234, provided wchar_t is long enough (16 bits suffices)
wchar_t s3[] = L"\x1234";

Octal


An octal escape sequence consists of a backslash followed by one to three octal digits. The octal escape sequence ends when it either contains three octal digits, or the next character is not an octal digit. For example, \11 is an octal escape sequence denoting a byte with decimal value 9 (11 in octal). However, \1111 is the octal escape sequence \111 followed by the digit 1. In order to denote the byte with numerical value 1, followed by the digit 1, one could use "\1""1", since C concatenates adjacent string literals.

Some three-digit octal escape sequences are too large to fit in a single byte. This results in an implementation-defined value for the resulting byte.

The escape sequence \0 is a commonly used octal escape sequence, which denotes the null character, with value zero in ASCII and most encoding systems.

Universal character names


Since the C99 standard, C supports escape sequences that denote Unicode code points, called universal character names. They have the form \uhhhh or \Uhhhhhhhh, where h stands for a hex digit. Unlike other escape sequences, a universal character name may expand into more than one code unit.

The sequence \uhhhh denotes the code point hhhh, interpreted as a hexadecimal number. The sequence \Uhhhhhhhh denotes the code point hhhhhhhh, interpreted as a hexadecimal number. Code points located at U+10000 or higher must be denoted with the \U syntax, whereas lower code points may use \u or \U. The code point is converted into a sequence of code units in the encoding of the destination type on the target system. For example, where the encoding of char is UTF-8 and the encoding of wchar_t is UTF-16:

// A single byte with the value 0xC0; not valid UTF-8
char s1[] = "\xC0";
// Two bytes with values 0xC3, 0x80; the UTF-8 encoding of U+00C0
char s2[] = "\u00C0";
// A single wchar_t with the value 0x00C0
wchar_t s3[] = L"\xC0";
// A single wchar_t with the value 0x00C0
wchar_t s4[] = L"\u00C0";

A value greater than \U0000FFFF may be represented by a single wchar_t if the UTF-32 encoding is used, or two if UTF-16 is used.

Importantly, the universal character name \u00C0 always denotes the character "À", regardless of what kind of string literal it is used in, or the encoding in use. The octal and hex escape sequences always denote certain sequences of numerical values, regardless of encoding. Therefore, universal character names are complementary to octal and hex escape sequences; while octal and hex escape sequences represent code units, universal character names represent code points, which may be thought of as "logical" characters.

Alternatives


Some languages provide different mechanisms for the behavior that escape sequences provide. For example, the following Pascal code writes the two words on sequential lines:

writeln('Foo');
write('Bar');

writeln outputs a newline after the parameter text, while write does not.


References

  1. "Rationale for International Standard - Programming Languages - C" (PDF). 5.10. April 2003. Archived (PDF) from the original on 2016-06-06. Retrieved 2010-10-17.
  2. "6.35 The Character <ESC> in Constants". GCC 4.8.2 Manual. Archived from the original on 2019-05-12. Retrieved 2014-03-08.