File (command)

file
Developer(s): AT&T Bell Laboratories
Initial release: 1973 as part of Unix Research Version 4; 1986 open-source reimplementation
Stable release: 5.46[1] / 27 November 2024
Repository: github.com/file/file
Written in: C
Operating system: Unix, Unix-like, Plan 9, IBM i
Platform: Cross-platform
Type: File type detector
License: BSD license, CDDL; Plan 9: MIT License
Website: darwinsys.com/file/

The file command is a standard program of Unix and Unix-like operating systems for recognizing the type of data contained in a computer file.

History

The original version of file appeared in Unix Research Version 4[2] in 1973. System V brought a major update with several important changes, most notably moving the file type information into an external text file rather than compiling it into the binary itself.

Most major BSD and Linux distributions use a free, open-source reimplementation written from scratch by Ian Darwin[3] in 1986–87; it keeps file type information in a text file with a format based on that of the System V version. It was expanded by Geoff Collyer in 1989 and has since had input from many others, including Guy Harris, Chris Lowth and Eric Fischer; from late 1993 onward its maintenance has been organized by Christos Zoulas. OpenBSD has its own subset implementation, also written from scratch, but it still uses the Darwin/Zoulas collection of magic-file-formatted information.

The file command has also been ported to the IBM i operating system.[4]

Specification

The Single UNIX Specification (SUS) specifies that a series of tests is performed on the file specified on the command line:

  1. If the file cannot be read, or its Unix file type is undetermined, file indicates that the file was processed but its type was undetermined.
  2. file must be able to identify the types directory, FIFO, socket, block special file, and character special file.
  3. Zero-length files are identified as such.
  4. An initial part of the file is considered, and file is to use position-sensitive tests.
  5. The entire file is considered, and file is to use context-sensitive tests.
  6. Otherwise, the file is identified as a data file.
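
The sequence above can be illustrated with a minimal Python sketch. It is not the code of any real file implementation: the function name classify, the returned labels, the single gzip check standing in for step 4, and the ASCII check standing in for step 5 are all assumptions chosen for illustration.

```python
import os
import stat


def classify(path):
    """Illustrative walk through the SUS test order (hypothetical helper)."""
    # Step 1: the file cannot be examined at all.
    try:
        st = os.stat(path)
    except OSError:
        return "cannot open"

    # Step 2: non-regular Unix file types are reported from the mode bits.
    for predicate, label in [
        (stat.S_ISDIR, "directory"),
        (stat.S_ISFIFO, "fifo"),
        (stat.S_ISSOCK, "socket"),
        (stat.S_ISBLK, "block special"),
        (stat.S_ISCHR, "character special"),
    ]:
        if predicate(st.st_mode):
            return label

    # Step 3: zero-length files.
    if st.st_size == 0:
        return "empty"

    try:
        with open(path, "rb") as fh:
            data = fh.read()
    except OSError:
        return "cannot open"

    # Step 4: position-sensitive test on an initial part of the file.
    # Only the two-byte gzip signature is checked here, as a stand-in.
    if data[:512].startswith(b"\x1f\x8b"):
        return "gzip compressed data"

    # Step 5: context-sensitive test over the entire file.
    # A plain ASCII check is used as a stand-in.
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        pass

    # Step 6: nothing matched.
    return "data"
```

The ordering is the point of the sketch: cheap metadata checks come first, and content is read only for non-empty regular files.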

file's position-sensitive tests are normally implemented by matching various locations within the file against a textual database of magic numbers (see the Usage section). This differs from other simpler methods such as file extensions and schemes like MIME.

In the System V implementation, the Ian Darwin implementation, and the OpenBSD implementation, the file command uses a database to drive the probing of the lead bytes. That database is implemented in a file called magic, usually located at /etc/magic, /usr/share/file/magic or a similar path.
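
As a rough illustration of database-driven probing, the following Python sketch matches byte strings at fixed offsets. The in-memory MAGIC_TABLE is a hypothetical stand-in and does not use the real magic file syntax; the function name identify and the keep_going parameter (loosely modelled on the idea of continuing after the first match) are likewise assumptions. The signatures themselves are well-known ones.

```python
# Hypothetical stand-in for a magic database: (offset, expected bytes, description).
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (0, b"!<arch>\n", "current ar archive"),
    (257, b"ustar", "POSIX tar archive"),
]


def identify(data, table=MAGIC_TABLE, keep_going=False):
    """Return descriptions whose signature matches at the given offset."""
    matches = [
        description
        for offset, magic, description in table
        # Slicing past the end yields a shorter bytes object, so short
        # inputs simply fail to match instead of raising.
        if data[offset:offset + len(magic)] == magic
    ]
    if not matches:
        return ["data"]
    # Without keep_going, stop at the first entry that matched.
    return matches if keep_going else matches[:1]
```

Keeping the signatures in a table rather than in code mirrors the design choice described above: new formats can be recognised by adding entries instead of changing the program.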

Usage

The SUS[5] mandates the following options:

  • -M file, specify a specially formatted file containing position-sensitive tests; default position-sensitive tests and context-sensitive tests will not be performed.
  • -m file, as for -M, but default tests will be performed after the tests contained in file.
  • -d, perform default position-sensitive and context-sensitive tests to the given file; this is the default behaviour unless -M or -m is specified.
  • -h, do not dereference symbolic links that point to an existing file or directory.
  • -L, dereference the symbolic link that points to an existing file or directory.
  • -i, do not classify the file further than to identify it as either: nonexistent, a block special file, a character special file, a directory, a FIFO, a socket, a symbolic link, or a regular file. Linux[6] and BSD[7] systems behave differently with this option and instead output an Internet media type ("MIME type") identifying the recognized file format.

Other Unix and Unix-like operating systems may add options beyond these. Ian Darwin's implementation adds -s 'special files', -k 'keep-going' and -r 'raw' (examples below), among many others.[6]

The command reports only what a file looks like, not what it is (in the case where file examines the content). The program is easy to fool by placing a magic number in a file whose content does not match it. The command is therefore not usable as a security tool except in specific situations.
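
The following Python sketch shows why a signature match alone proves little. It builds a byte string that begins with the two-byte gzip signature but is not a valid gzip stream; the helper names looks_like_gzip and is_valid_gzip are hypothetical, and the check is a simplification of what file actually does.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"

# Starts with the gzip signature, but the rest is ordinary text.
fake = GZIP_MAGIC + b"this is not a gzip stream"


def looks_like_gzip(data):
    """Signature-only check, comparable to a position-sensitive test."""
    return data.startswith(GZIP_MAGIC)


def is_valid_gzip(data):
    """Stricter check: actually try to decompress the data."""
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        return False
```

Here looks_like_gzip(fake) is true while is_valid_gzip(fake) is false, so any decision that matters should rely on fully parsing the content rather than on its leading bytes.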

Examples

$ file file.c
file.c: C program text
$ file program
program: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked
    (uses shared libs), stripped
$ file /dev/hda1
/dev/hda1: block special (0/0)
$ file -s /dev/hda1
/dev/hda1: Linux/i386 ext2 filesystem

Note that -s is a non-standard option available only on the Ian Darwin branch, which tells file to read device files and try to identify their contents rather than merely identifying them as device files. Normally file does not try to read device files since reading such a file can have undesirable side effects.

$ file -k -r libmagic-dev_5.35-4_armhf.deb    # (on Linux)
libmagic-dev_5.35-4_armhf.deb: Debian binary package (format 2.0)
- current ar archive
- data

With Ian Darwin's non-standard -k option, the program does not stop at the first match but keeps looking for other matching patterns. The -r option, which is available in some versions, causes the unprintable newline character to be displayed in its raw form rather than in its octal representation.

$ file compressed.gz
compressed.gz: gzip compressed data, deflated, original filename, `compressed', last
    modified: Thu Jan 26 14:08:23 2006, os: Unix
$ file -i compressed.gz    # (on Linux)
compressed.gz: application/x-gzip; charset=binary
$ file data.ppm
data.ppm: Netpbm PPM "rawbits" image data
$ file /bin/cat
/bin/cat: Mach-O universal binary with 2 architectures
/bin/cat (for architecture ppc7400):	Mach-O executable ppc
/bin/cat (for architecture i386):	Mach-O executable i386
$ file /usr/bin/vi
/usr/bin/vi: symbolic link to vim

Identification of symbolic links is not available on all platforms; the link is dereferenced instead if -L is passed or POSIXLY_CORRECT is set.
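
The difference between reporting a link and dereferencing it corresponds to the difference between the lstat and stat system calls. The Python sketch below illustrates this under stated assumptions: the function name describe, its dereference parameter, and the returned labels are hypothetical and only loosely modelled on the -h and -L behaviour described in the Usage section.

```python
import os
import stat


def describe(path, dereference=False):
    """Describe a path, optionally following a symbolic link to its target."""
    # Follow the link only when asked to and when the target exists;
    # a dangling link is still reported as a link.
    if dereference and os.path.exists(path):
        st = os.stat(path)
    else:
        st = os.lstat(path)

    if stat.S_ISLNK(st.st_mode):
        return "symbolic link to " + os.readlink(path)
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    return "regular file"
```

For a link named vi that points to vim, describe would report the link itself by default and the type of vim when dereference is true.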

Libmagic library

As of version 4.00 of the Ian Darwin/Christos Zoulas version of file, the functionality of file is incorporated into a libmagic library that is accessible via C (and C-compatible) linking;[8][9] file is implemented using that library.[10][11]

References

  1. ^ "[File] FIle 5.46 is now available". 27 November 2024. Retrieved 28 November 2024.
  2. ^ "Source of the UNIX V4 "file" man page". Archived from the original on 2019-12-10. Retrieved 2022-03-13.
  3. ^ The early history of this program is recorded in its private CVS repository; see the log of the main program (archived 2017-04-01 at the Wayback Machine).
  4. ^ "IBM System i Version 7.2 Programming Qshell" (PDF). IBM. Archived (PDF) from the original on 2021-03-05. Retrieved 2020-09-05.
  5. ^ "The Open Group Base Specifications Issue 7 — file command". Archived from the original on 2018-10-12. Retrieved 2014-08-20.
  6. ^ a b file(1) – Linux User Manual – User Commands
  7. ^ file(1) – NetBSD General Commands Manual
  8. ^ libmagic(3) – Linux Programmer's Manual – Library Functions
  9. ^ libmagic(3) – NetBSD Library Functions Manual
  10. ^ Zoulas, Christos (February 27, 2003). "file-3.41 is now available". File (Mailing list). Archived from the original on March 4, 2016. Retrieved January 1, 2013.
  11. ^ Zoulas, Christos (March 24, 2003). "file-4.00 is now available". File (Mailing list). Archived from the original on December 28, 2016. Retrieved January 1, 2013.

Other

  • Fine Free File Command – homepage for Ian Darwin's version of file used in major BSD and Linux distributions.
  • binwalk, a firmware analysis tool that carves files based on libmagic signatures
  • TrID, an alternative providing ranked answers (instead of just one) based on statistics.
  • Magika, an ML-based tool, by Google Research