comm

From Wikipedia, the free encyclopedia

comm
Example usage of `comm` command
Original author(s)	Lee E. McMahon
Developer(s)	AT&T Bell Laboratories, Richard Stallman, David MacKenzie
Initial release	November 1973
Written in	C
Operating system	Unix, Unix-like, Plan 9, Inferno
Platform	Cross-platform
Type	Command
License	coreutils: GPLv3+ Plan 9: MIT License

The comm command in the Unix family of computer operating systems is a utility that compares two files for common and distinct lines. comm is specified in the POSIX standard. It has been widely available on Unix-like operating systems since the mid to late 1980s.

History

Written by Lee E. McMahon, comm first appeared in Version 4 Unix.^[1]

The version of comm bundled in GNU coreutils was written by Richard Stallman and David MacKenzie.^[2]

Usage

comm reads two files as input, treating each as lines of text, and writes a single output with three columns. The first column contains lines unique to the first file, the second column contains lines unique to the second file, and the third column contains lines common to both. Functionally, this is similar to diff.

Columns are typically distinguished with the <tab> character. If the input files contain lines beginning with the separator character, the output columns can become ambiguous.

For efficiency, standard implementations of comm expect both input files to be sorted lexically in the same collation order. The sort command can be used for this purpose.

The comm algorithm makes use of the collating sequence of the current locale. If the lines in the files are not both collated in accordance with the current locale, the result is undefined.
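The sorting requirement can be illustrated with a minimal sketch. The scratch directory and the file names are hypothetical, and LC_ALL=C is chosen here only so that sort and comm agree on a predictable collation order; any locale works as long as both commands use the same one.

```shell
# Work in a scratch directory; the file names below are illustrative only.
dir=$(mktemp -d)
printf '%s\n' eggplant apple banana > "$dir/foo.txt"
printf '%s\n' zucchini banana apple > "$dir/bar.txt"

# Sort both inputs under the same locale that comm will run in.
LC_ALL=C sort "$dir/foo.txt" > "$dir/foo.sorted"
LC_ALL=C sort "$dir/bar.txt" > "$dir/bar.sorted"

# Compare the sorted files. Lines common to both appear in the third
# column, which is preceded by two tab characters.
sorted_result=$(LC_ALL=C comm "$dir/foo.sorted" "$dir/bar.sorted")
printf '%s\n' "$sorted_result"

rm -r "$dir"
```

Running comm on the unsorted files instead would give an undefined result, as described above.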

Return code

Unlike diff, the return code from comm has no logical significance concerning the relationship of the two files. A return code of 0 indicates success; a return code greater than 0 indicates that an error occurred during processing.
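As a rough illustration of this difference, the following sketch runs both utilities on two sorted files that differ and records their exit statuses. The scratch directory is an assumption of the example; the file contents mirror the foo and bar files used elsewhere in this article.

```shell
dir=$(mktemp -d)
printf '%s\n' apple banana eggplant > "$dir/foo"
printf '%s\n' apple banana banana zucchini > "$dir/bar"

# comm succeeds even though the files differ.
comm_status=0
comm "$dir/foo" "$dir/bar" > /dev/null || comm_status=$?

# diff signals a difference through its exit status (1 means "files differ").
diff_status=0
diff "$dir/foo" "$dir/bar" > /dev/null || diff_status=$?

echo "comm: $comm_status, diff: $diff_status"   # prints "comm: 0, diff: 1"

rm -r "$dir"
```

A script that needs to know whether the files differ therefore has to inspect the output of comm rather than its return code.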

Example

$ cat foo
apple
banana
eggplant
$ cat bar
apple
banana
banana
zucchini
$ comm foo bar
                  apple
                  banana
          banana
eggplant
          zucchini

This shows that both files have one banana, but only bar has a second banana.

In more detail, the output has the appearance that follows. Note that the column a line belongs to is determined by the number of leading tab characters. \t represents a tab character and \n represents a newline.

	0	1	2	3	4	5	6	7	8	9
0	\t	\t	a	p	p	l	e	\n
1	\t	\t	b	a	n	a	n	a	\n
2	\t	b	a	n	a	n	a	\n
3	e	g	g	p	l	a	n	t	\n
4	\t	z	u	c	c	h	i	n	i	\n
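The leading tab characters can also be made visible directly in a shell. The sketch below pipes the output of comm through the l command of sed, which prints tabs as \t and marks each line end with $. The scratch directory is an assumption of the example; the file contents are those of foo and bar above.

```shell
dir=$(mktemp -d)
printf '%s\n' apple banana eggplant > "$dir/foo"
printf '%s\n' apple banana banana zucchini > "$dir/bar"

# "sed -n l" writes each line in a visually unambiguous form:
# a tab becomes \t and the end of the line is marked with $.
visible=$(comm "$dir/foo" "$dir/bar" | sed -n l)
printf '%s\n' "$visible"
# Prints:
# \t\tapple$
# \t\tbanana$
# \tbanana$
# eggplant$
# \tzucchini$

rm -r "$dir"
```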

Comparison to diff

In general terms, diff is a more powerful utility than comm. The simpler comm is best suited for use in scripts.

The primary distinction between comm and diff is that comm discards information about the order of the lines prior to sorting.

A minor difference between comm and diff is that comm will not try to indicate that a line has "changed" between the two files; each line is shown in exactly one of the "from file #1", "from file #2", or "in both" columns. This can be useful if one wishes two lines to be considered different even if they only have subtle differences.
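The effect of discarding line order can be sketched with two hypothetical files, a and b, that hold the same lines in a different order. diff reports a difference, while comm, run on the sorted copies, finds no line unique to either file. The file names, the scratch directory, and the use of LC_ALL=C are assumptions of this example.

```shell
dir=$(mktemp -d)
printf '%s\n' banana apple > "$dir/a"
printf '%s\n' apple banana > "$dir/b"

# diff treats the reordering as a change (exit status 1).
diff_status=0
diff "$dir/a" "$dir/b" > /dev/null || diff_status=$?

# Sorting removes the ordering information before comm sees the files.
LC_ALL=C sort "$dir/a" > "$dir/a.sorted"
LC_ALL=C sort "$dir/b" > "$dir/b.sorted"

# -3 suppresses the common column, so only unique lines would be printed.
unique_lines=$(LC_ALL=C comm -3 "$dir/a.sorted" "$dir/b.sorted")

echo "diff status: $diff_status"
echo "lines unique to either file: '$unique_lines'"

rm -r "$dir"
```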

Other options

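comm has command-line options to suppress any of the three columns: -1, -2 and -3 suppress the first, second and third column respectively, and they can be combined (for example, -12 prints only the lines common to both files). This is useful for scripting.

One file (but not both) can be read from standard input by giving - as its file name.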
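A minimal sketch of how these options might be used in a script follows. The scratch directory and the variable names are assumptions of the example; the file contents are those of foo and bar from the example section.

```shell
dir=$(mktemp -d)
printf '%s\n' apple banana eggplant > "$dir/foo"
printf '%s\n' apple banana banana zucchini > "$dir/bar"

# Suppress columns 1 and 2: only lines common to both files remain.
common=$(comm -12 "$dir/foo" "$dir/bar")

# Suppress columns 2 and 3: only lines unique to the first file remain.
only_foo=$(comm -23 "$dir/foo" "$dir/bar")

# Suppress columns 1 and 3: only lines unique to the second file remain.
only_bar=$(comm -13 "$dir/foo" "$dir/bar")

# An operand of "-" makes comm read that input from standard input.
from_stdin=$(printf '%s\n' apple banana eggplant | comm -12 - "$dir/bar")

printf 'common:\n%s\n' "$common"
printf 'only in foo:\n%s\n' "$only_foo"
printf 'only in bar:\n%s\n' "$only_bar"

rm -r "$dir"
```

When columns are suppressed, the tabs that would have preceded the remaining columns are dropped as well, so the captured values contain no leading whitespace.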

Limits

Up to a full line must be buffered from each input file during line comparison, before the next output line is written.

Some implementations read lines with the function readlinebuffer(), which does not impose any line length limit as long as system memory suffices.

Other implementations read lines with the function fgets(). This function requires a fixed buffer. For these implementations, the buffer is often sized according to the POSIX macro LINE_MAX.

References

1. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.

2. "Comm(1): Compare two sorted files line by line - Linux man page".

External links

comm: select or reject lines common to two files – Shell and Utilities Reference, The Single UNIX Specification, Version 4 from The Open Group
comm(1) – Plan 9 Programmer's Manual, Volume 1
comm(1) – Inferno General commands Manual

Retrieved from "https://en.wikipedia.org/w/index.php?title=Comm&oldid=1191977213". This page was last edited on 26 December 2023, at 22:53 (UTC). Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply.