------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--          Copyright (C) 1992-2012, Free Software Foundation, Inc.         --
--                                                                          --
-- GNAT is free software; you can redistribute it and/or modify it under    --
-- terms of the GNU General Public License as published by the Free         --
-- Software Foundation; either version 3, or (at your option) any later     --
-- version. GNAT is distributed in the hope that it will be useful, but     --
-- WITHOUT ANY WARRANTY; without even the implied warranty of               --
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.                     --
--                                                                          --
-- As a special exception under Section 7 of GPL version 3, you are granted --
-- additional permissions described in the GCC Runtime Library Exception,   --
-- version 3.1, as published by the Free Software Foundation.               --
--                                                                          --
-- You should have received a copy of the GNU General Public License and    --
-- a copy of the GCC Runtime Library Exception along with this program;     --
-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see     --
-- <http://www.gnu.org/licenses/>.                                          --
--                                                                          --
-- GNAT was originally developed by the GNAT team at New York University.   --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

--  This package contains the input routines used for reading the
--  input source file. The actual I/O routines are in OS_Interface,
--  with this module containing only the system independent processing.

--  General Note: throughout the compiler, we use the term line or source
--  line to refer to a physical line in the source, terminated by the end of
--  physical line sequence.

--  There are two distinct concepts of line terminator in GNAT:

--    A logical line terminator is what corresponds to the "end of a line" as
--    described in RM 2.2 (13). Any of the characters FF, LF, CR or VT or any
--    wide character that is a Line or Paragraph Separator acts as an end of
--    logical line in this sense, and it is essentially irrelevant whether one
--    or more appears in sequence (since if a sequence of such characters is
--    regarded as separate ends of line, then the intervening logical lines
--    are null in any case).

--    A physical line terminator is a sequence of format effectors that is
--    treated as ending a physical line. Physical lines have no Ada semantic
--    significance, but they are significant for error reporting purposes,
--    since errors are identified by line and column location.

--  In GNAT, a physical line is ended by any of the sequences LF, CR/LF, or
--  CR. LF is used in typical Unix systems, CR/LF in DOS systems, and CR
--  alone in Mac System 7. In addition, we recognize any of these sequences
--  in any of the operating systems, for better behavior in treating foreign
--  files (e.g. a Unix file with LF terminators transferred to a DOS system).
--  Finally, wide character codes in categories Separator, Line and Separator,
--  Paragraph are considered to be physical line terminators.

with Casing; use Casing;
with Namet;  use Namet;
with Types;  use Types;

   type Type_Of_File is (
   --  Indicates type of file being read

      --  Normal Ada source file

      --  Configuration pragma file

      --  Preprocessing definition file

      --  Source file with preprocessing commands to be preprocessed

   ----------------------------
   -- Source License Control --
   ----------------------------

   --  The following type indicates the license state of a source if it
   --  is known.

      --  Licensing status of this source unit is unknown

      --  This is a non-GPL'ed unit that is restricted from depending
      --  on GPL'ed units (e.g. proprietary code is in this category)

      --  This file is licensed under the unmodified GPL. It is not allowed
      --  to depend on Non_GPL units, and Non_GPL units may not depend on
      --  this file.

      --  This file is licensed under the GNAT modified GPL (see header of
      --  this file for wording of the modification). It may depend on other
      --  Modified_GPL units or on unrestricted units.

      --  The license on this file is permitted to depend on any other
      --  units, or have other units depend on it, without violating the
      --  license of this unit. Examples are public domain units, and
      --  units defined in the RM.

   --  The above license status is checked when the appropriate check is
   --  activated and one source depends on another, and the licensing state
   --  of both files is known.

   --  The prohibited combinations are:

   --    Restricted file may not depend on GPL file

   --    GPL file may not depend on Restricted file

   --    Modified_GPL file may not depend on Restricted file
   --    Modified_GPL file may not depend on GPL file

   --  The reason for the last restriction here is that a client depending
   --  on a modified GPL file must be sure that the license condition is
   --  correct when considered transitively.

   --  The licensing status is determined either by the presence of a
   --  specific pragma License, or by scanning the header for a predefined
   --  file, or any file if compiling in -gnatg mode.
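
   --  As an illustration only, the four prohibited combinations listed in
   --  this section can be written as one predicate. This is a sketch and not
   --  the check performed by the compiler: the type License_Kind, its
   --  literal names and the function Dependency_Allowed are assumptions
   --  chosen to match the wording of the comments here.

   --     with Ada.Assertions; use Ada.Assertions;
   --
   --     procedure License_Demo is
   --        type License_Kind is
   --          (Unknown, Restricted, GPL, Modified_GPL, Unrestricted);
   --
   --        function Dependency_Allowed
   --          (Client, Server : License_Kind) return Boolean is
   --        begin
   --           --  No check is made unless both states are known
   --           if Client = Unknown or else Server = Unknown then
   --              return True;
   --           end if;
   --
   --           case Client is
   --              when Restricted =>
   --                 return Server /= GPL;
   --              when GPL =>
   --                 return Server /= Restricted;
   --              when Modified_GPL =>
   --                 return Server /= Restricted and then Server /= GPL;
   --              when Unknown | Unrestricted =>
   --                 return True;
   --           end case;
   --        end Dependency_Allowed;
   --
   --     begin
   --        Assert (not Dependency_Allowed (Restricted, GPL));
   --        Assert (not Dependency_Allowed (Modified_GPL, GPL));
   --        Assert (Dependency_Allowed (GPL, Modified_GPL));
   --        Assert (Dependency_Allowed (Unrestricted, GPL));
   --     end License_Demo;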

   -----------------------
   -- Source File Table --
   -----------------------

   --  The source file table has an entry for each source file read in for
   --  this run of the compiler. This table is (default) initialized when
   --  the compiler is loaded, and simply accumulates entries as compilation
   --  proceeds and various routines in Sinput and its child packages are
   --  called to load required source files.

   --  Virtual entries are also created for generic templates when they are
   --  instantiated, as described in a separate section later on.

   --  In the case where there are multiple main units (e.g. in the case of
   --  the cross-reference tool), this table is not reset between these units,
   --  so that a given source file is only read once if it is used by two
   --  separate main units.

   --  The entries in the table are accessed using a Source_File_Index that
   --  ranges from 1 to Last_Source_File. Each entry has the following fields:

   --  Note: fields marked read-only are set by Sinput or one of its child
   --  packages when a source file table entry is created, and cannot be
   --  subsequently modified, or alternatively are set only in very special
   --  circumstances, documented in the comments.

   --  File_Name : File_Name_Type (read-only)
   --    Name of the source file (simple name with no directory information)

   --  Full_File_Name : File_Name_Type (read-only)
   --    Full file name (full name with directory info), used for generation
   --    of error messages, etc.

   --  File_Type : Type_Of_File (read-only)
   --    Indicates type of file (source file, configuration pragmas file,
   --    preprocessor definition file, preprocessor input file).

   --  Reference_Name : File_Name_Type (read-only)
   --    Name to be used for source file references in error messages where
   --    only the simple name of the file is required. Identical to File_Name
   --    unless pragma Source_Reference is used to change it. Only processing
   --    for the Source_Reference pragma circuit may set this field.

   --  Full_Ref_Name : File_Name_Type (read-only)
   --    Name to be used for source file references in error messages where
   --    the full name of the file is required. Identical to Full_File_Name
   --    unless pragma Source_Reference is used to change it. Only processing
   --    for the Source_Reference pragma may set this field.

   --  Debug_Source_Name : File_Name_Type (read-only)
   --    Name to be used for source file references in debugging information
   --    where only the simple name of the file is required. Identical to
   --    Reference_Name unless the -gnatD (debug source file) switch is used.
   --    Only processing in Sprint that generates this file is permitted to
   --    set this field.

   --  Full_Debug_Name : File_Name_Type (read-only)
   --    Name to be used for source file references in debugging information
   --    where the full name of the file is required. This is identical to
   --    Full_Ref_Name unless the -gnatD (debug source file) switch is used.
   --    Only processing in Sprint that generates this file is permitted to
   --    set this field.

   --  License : License_Type;
   --    License status of source file

   --  Num_SRef_Pragmas : Nat;
   --    Number of source reference pragmas present in source file

   --  First_Mapped_Line : Logical_Line_Number;
   --    This field stores logical line number of the first line in the
   --    file that is not a Source_Reference pragma. If no source reference
   --    pragmas are used, then the value is set to No_Line_Number.

   --  Source_Text : Source_Buffer_Ptr (read-only)
   --    Text of source file. Note that every source file has a distinct set
   --    of non-overlapping logical bounds, so it is possible to determine
   --    which file is referenced from a given subscript (Source_Ptr) value.

   --  Source_First : Source_Ptr; (read-only)
   --    Subscript of first character in Source_Text. Note that this cannot
   --    be obtained as Source_Text'First, because we use virtual origin
   --    addressing.

   --  Source_Last : Source_Ptr; (read-only)
   --    Subscript of last character in Source_Text. Note that this cannot
   --    be obtained as Source_Text'Last, because we use virtual origin
   --    addressing, so this value is always Source_Ptr'Last.

   --  Time_Stamp : Time_Stamp_Type; (read-only)
   --    Time stamp of the source file

   --  Source_Checksum : Word;
   --    Computed checksum for contents of source file. See separate section
   --    later on in this spec for a description of the checksum algorithm.

   --  Last_Source_Line : Physical_Line_Number;
   --    Physical line number of last source line. While a file is being
   --    read, this refers to the last line scanned. Once a file has been
   --    completely scanned, it is the number of the last line in the file,
   --    and hence also gives the number of source lines in the file.

   --  Keyword_Casing : Casing_Type;
   --    Casing style used in file for keyword casing. This is initialized
   --    to Unknown, and then set from the first occurrence of a keyword.
   --    This value is used only for formatting of error messages.

   --  Identifier_Casing : Casing_Type;
   --    Casing style used in file for identifier casing. This is initialized
   --    to Unknown, and then set from an identifier in the program as soon as
   --    one is found whose casing is sufficiently clear to make a decision.
   --    This value is used for formatting of error messages, and also is used
   --    in the detection of keywords misused as identifiers.

   --  Instantiation : Source_Ptr;
   --    Source file location of the instantiation if this source file entry
   --    represents a generic instantiation. Set to No_Location for the case
   --    of a normal non-instantiation entry. See section below for details.
   --    This field is read-only for clients.

   --  Inlined_Body : Boolean;
   --    This can only be set True if Instantiation has a value other than
   --    No_Location. If true it indicates that the instantiation is actually
   --    an instance of an inlined body.

   --  Template : Source_File_Index; (read-only)
   --    Source file index of the source file containing the template if this
   --    is a generic instantiation. Set to No_Source_File for the normal case
   --    of a non-instantiation entry. See Sinput.L for details.

   --  Unit : Unit_Number_Type;
   --    Identifies the unit contained in this source file. Set by
   --    Initialize_Scanner, must not be subsequently altered.

   --  The source file table is accessed by clients using the following
   --  subprogram interface:

   subtype SFI is Source_File_Index;

   System_Source_File_Index : SFI;
   --  The file system.ads is always read by the compiler to determine the
   --  settings of the target parameters in the private part of System. This
   --  variable records the source file index of system.ads. Typically this
   --  will be 1 since system.ads is read first.

   function Debug_Source_Name (S : SFI) return File_Name_Type;
   function File_Name         (S : SFI) return File_Name_Type;
   function File_Type         (S : SFI) return Type_Of_File;
   function First_Mapped_Line (S : SFI) return Logical_Line_Number;
   function Full_Debug_Name   (S : SFI) return File_Name_Type;
   function Full_File_Name    (S : SFI) return File_Name_Type;
   function Full_Ref_Name     (S : SFI) return File_Name_Type;
   function Identifier_Casing (S : SFI) return Casing_Type;
   function Inlined_Body      (S : SFI) return Boolean;
   function Instantiation     (S : SFI) return Source_Ptr;
   function Keyword_Casing    (S : SFI) return Casing_Type;
   function Last_Source_Line  (S : SFI) return Physical_Line_Number;
   function License           (S : SFI) return License_Type;
   function Num_SRef_Pragmas  (S : SFI) return Nat;
   function Reference_Name    (S : SFI) return File_Name_Type;
   function Source_Checksum   (S : SFI) return Word;
   function Source_First      (S : SFI) return Source_Ptr;
   function Source_Last       (S : SFI) return Source_Ptr;
   function Source_Text       (S : SFI) return Source_Buffer_Ptr;
   function Template          (S : SFI) return Source_File_Index;
   function Unit              (S : SFI) return Unit_Number_Type;
   function Time_Stamp        (S : SFI) return Time_Stamp_Type;

   procedure Set_Keyword_Casing    (S : SFI; C : Casing_Type);
   procedure Set_Identifier_Casing (S : SFI; C : Casing_Type);
   procedure Set_License           (S : SFI; L : License_Type);
   procedure Set_Unit              (S : SFI; U : Unit_Number_Type);

   function Last_Source_File return Source_File_Index;
   --  Index of last source file table entry

   function Num_Source_Files return Nat;
   --  Number of source file table entries

   procedure Initialize;
   --  Initialize internal tables

   --  Lock internal tables

   --  Unlock internal tables

   Main_Source_File : Source_File_Index := No_Source_File;
   --  This is set to the source file index of the main unit

   -----------------------------
   -- Source_File_Index_Table --
   -----------------------------

   --  The Get_Source_File_Index function is called very frequently. Earlier
   --  versions cached a single entry, but then reverted to a serial search,
   --  and this proved to be a significant source of inefficiency. To get
   --  around this, we use the following directly indexed array. The space
   --  of possible input values is a value of type Source_Ptr which is simply
   --  an Int value. The values in this space are allocated sequentially as
   --  new units are loaded.

   --  The following table has an entry for each 4K range of possible
   --  Source_Ptr values. The value in the table is the lowest value
   --  Source_File_Index whose Source_Ptr range contains a value in the
   --  range.

   --  For example, the entry with index 4 in this table represents Source_Ptr
   --  values in the range 4*4096 .. 5*4096-1. The Source_File_Index value
   --  stored would be the lowest numbered source file with at least one byte
   --  in this range.

   --  The algorithm used in Get_Source_File_Index is simply to access this
   --  table and then do a serial search starting at the given position. This
   --  will almost always terminate with one or two checks.

   --  Note that this array is pretty large, but in most operating systems
   --  it will not be allocated in physical memory unless it is actually used.

   Chunk_Power : constant := 12;
   Chunk_Size  : constant := 2 ** Chunk_Power;
   --  Change comments above if value changed. Note that Chunk_Size must
   --  be a power of 2 (to allow for efficient access to the table).

   Source_File_Index_Table :
     array (Int range 0 .. Int'Last / Chunk_Size) of Source_File_Index;

   procedure Set_Source_File_Index_Table (Xnew : Source_File_Index);
   --  Sets entries in the Source_File_Index_Table for the newly created
   --  Source_File table entry whose index is Xnew. The Source_First and
   --  Source_Last fields of this entry must be set before the call.
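
   --  The chunk table and the serial search described in this section can
   --  be sketched as follows. This is an illustration under stated
   --  assumptions, not the body of Get_Source_File_Index or of
   --  Set_Source_File_Index_Table: the three hypothetical files, their
   --  bounds, and the names Bounds, Index_Table and Lookup are invented for
   --  the example, and plain integers stand in for Source_Ptr values.

   --     with Ada.Assertions; use Ada.Assertions;
   --
   --     procedure Chunk_Demo is
   --        Chunk_Power : constant := 12;
   --        Chunk_Size  : constant := 2 ** Chunk_Power;
   --
   --        type Bound is record
   --           First, Last : Natural;
   --        end record;
   --
   --        --  Hypothetical Source_First/Source_Last pairs of three files
   --        Bounds : constant array (1 .. 3) of Bound :=
   --          ((0, 999), (1_000, 9_999), (10_000, 20_000));
   --
   --        Index_Table : array (0 .. 20_000 / Chunk_Size) of Positive;
   --
   --        function Lookup (S : Natural) return Positive is
   --           F : Positive := Index_Table (S / Chunk_Size);
   --        begin
   --           --  Serial search starting at the lowest file in the chunk
   --           while S > Bounds (F).Last loop
   --              F := F + 1;
   --           end loop;
   --
   --           return F;
   --        end Lookup;
   --
   --     begin
   --        --  Going backwards leaves the lowest numbered file in each chunk
   --        for F in reverse Bounds'Range loop
   --           for C in Bounds (F).First / Chunk_Size ..
   --                    Bounds (F).Last / Chunk_Size
   --           loop
   --              Index_Table (C) := F;
   --           end loop;
   --        end loop;
   --
   --        Assert (Lookup (500) = 1);
   --        Assert (Lookup (1_500) = 2);
   --        Assert (Lookup (10_000) = 3);
   --     end Chunk_Demo;

   --  In this sketch positions 1_500 and 10_000 each need one extra step of
   --  the serial search, because their chunk is shared with a lower numbered
   --  file.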

   -----------------------
   -- Checksum Handling --
   -----------------------

   --  As a source file is scanned, a checksum is computed by taking all the
   --  non-blank characters in the file, excluding comment characters, the
   --  minus-minus sequence starting a comment, and all control characters
   --  except ESC.

   --  The checksum algorithm used is the standard CRC-32 algorithm, as
   --  implemented by System.CRC32, except that we do not bother with the
   --  final XOR with all 1 bits.

   --  This algorithm ensures that the checksum includes all semantically
   --  significant aspects of the program represented by the source file,
   --  but is insensitive to layout, presence or contents of comments, wide
   --  character representation method, or casing conventions outside strings.

   --  Scans.Checksum is initialized appropriately at the start of scanning
   --  a file, and copied into the Source_Checksum field of the file table
   --  entry when the end of file is encountered.
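
   --  A minimal sketch of a CRC-32 accumulation that skips blanks and omits
   --  the final XOR is given below. It is not the scanner's code and does
   --  not call System.CRC32: the bit-by-bit Update function, the usual
   --  reflected polynomial 16#EDB8_8320# and the all-ones initial value are
   --  assumptions of the example, and the handling of comments, casing and
   --  wide characters is left out.

   --     with Ada.Assertions; use Ada.Assertions;
   --     with Interfaces;     use Interfaces;
   --
   --     procedure Checksum_Demo is
   --
   --        function Update
   --          (C : Unsigned_32; Ch : Character) return Unsigned_32
   --        is
   --           R : Unsigned_32 := C xor Unsigned_32 (Character'Pos (Ch));
   --        begin
   --           for Bit in 1 .. 8 loop
   --              if (R and 1) /= 0 then
   --                 R := Shift_Right (R, 1) xor 16#EDB8_8320#;
   --              else
   --                 R := Shift_Right (R, 1);
   --              end if;
   --           end loop;
   --
   --           return R;
   --        end Update;
   --
   --        Text : constant String := "12 345  6789";
   --        Sum  : Unsigned_32 := 16#FFFF_FFFF#;
   --
   --     begin
   --        for I in Text'Range loop
   --           if Text (I) /= ' ' then
   --              Sum := Update (Sum, Text (I));
   --           end if;
   --        end loop;
   --
   --        --  Standard CRC-32 would finish with a XOR against all 1 bits.
   --        --  The variant keeps Sum as it is; the XOR is applied here only
   --        --  to compare with the well-known CRC-32 of "123456789".
   --        Assert ((Sum xor 16#FFFF_FFFF#) = 16#CBF4_3926#);
   --     end Checksum_Demo;

   --  Because blanks are skipped, the differently spaced text gives the
   --  same value as "123456789", which is the layout insensitivity noted in
   --  this section.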

   -------------------------------------
   -- Handling Generic Instantiations --
   -------------------------------------

   --  As described in Sem_Ch12, a generic instantiation involves making a
   --  copy of the tree of the generic template. The source locations in
   --  this tree directly reference the source of the template. However it
   --  is also possible to find the location of the instantiation.

   --  This is achieved as follows. When an instantiation occurs, a new entry
   --  is made in the source file table. This entry points to the same source
   --  text, i.e. the file that contains the template, but has a distinct
   --  set of Source_Ptr index values. The separate range of Sloc values avoids
   --  confusion, and means that the Sloc values can still be used to uniquely
   --  identify the source file table entry. It is possible for both entries
   --  to point to the same text, because of the virtual origin pointers used
   --  in the source table.

   --  The Instantiation field of this source file index entry, usually set
   --  to No_Location, instead contains the Sloc of the instantiation. In
   --  the case of nested instantiations, this Sloc may itself refer to an
   --  instantiation, so the complete chain can be traced.

   --  Two routines are used to build these special entries in the source
   --  file table. Create_Instantiation_Source is first called to build
   --  the virtual source table entry for the instantiation, and then the
   --  Sloc values in the copy are adjusted using Adjust_Instantiation_Sloc.
   --  See child unit Sinput.L for details on these two routines.

   Current_Source_File : Source_File_Index := No_Source_File;
   --  Source_File table index of source file currently being scanned.
   --  Initialized so that some tools (such as gprbuild) can be built with
   --  -gnatVa and pragma Initialized_Scalars without problems.

   Current_Source_Unit : Unit_Number_Type;
   --  Unit number of source file currently being scanned. The special value
   --  of No_Unit indicates that the configuration pragma file is currently
   --  being scanned (this has no entry in the unit table).

   Source_gnat_adc : Source_File_Index := No_Source_File;
   --  This is set if a gnat.adc file is present to reference this file

   Source : Source_Buffer_Ptr;
   --  Current source (copy of Source_File.Table (Current_Source_Unit).Source)

   Internal_Source : aliased Source_Buffer (1 .. 81);
   --  This buffer is used internally in the compiler when the lexical analyzer
   --  is used to scan a string from within the compiler. The procedure is to
   --  establish Internal_Source_Ptr as the value of Source, set the string to
   --  be scanned, appropriately terminated, in this buffer, and set Scan_Ptr
   --  to point to the start of the buffer. It is a fatal error if the scanner
   --  signals an error while scanning a token in this internal buffer.

   Internal_Source_Ptr : constant Source_Buffer_Ptr :=
                           Internal_Source'Unrestricted_Access;
   --  Pointer to internal source buffer

   -----------------------------------------
   -- Handling of Source Line Terminators --
   -----------------------------------------

   --  In this section we discuss in detail the issue of terminators used to
   --  terminate source lines. The RM says that one or more format effectors
   --  (other than horizontal tab) end a source line, and defines the set of
   --  such format effectors, but does not talk about exactly how they are
   --  represented in the source program (since in general the RM is not in
   --  the business of specifying source program formats).

   --  The type Types.Line_Terminator is defined as a subtype of Character
   --  that includes CR/LF/VT/FF. The most common line enders in practice
   --  are CR (some MAC systems), LF (Unix systems), and CR/LF (DOS/Windows
   --  systems). Any of these sequences is recognized as ending a physical
   --  source line, and if multiple such terminators appear (e.g. LF/LF),
   --  then we consider we have an extra blank line.

   --  VT and FF are recognized as terminating source lines, but they are
   --  considered to end a logical line instead of a physical line, so that
   --  the line numbering ignores such terminators. The use of VT and FF is
   --  mandated by the standard, and correctly handled in a conforming manner
   --  by GNAT, but their use is not recommended.

   --  In addition to the set of characters defined by the type in Types, when
   --  a wide character encoding is in use, the codes for which a call to
   --  System.UTF_32.Is_UTF_32_Line_Terminator returns True are also
   --  recognized as ending a source line. This includes the standard codes
   --  defined above in addition to NEL (NEXT LINE), LINE SEPARATOR and
   --  PARAGRAPH SEPARATOR. Again, as in the case of VT and FF, the standard
   --  requires we recognize these as line terminators, but we consider them
   --  to be logical line terminators. The only physical line terminators
   --  recognized are the standard ones (CR, LF, or CR/LF).

   --  However, we do not recognize the NEL (16#85#) character as having the
   --  significance of an end of line character when operating in normal 8-bit
   --  Latin-n input mode for the compiler. Instead the rule in this mode is
   --  that all upper half control codes (16#80# .. 16#9F#) are illegal if they
   --  occur in program text, and are ignored if they appear in comments.

   --  First, note that this behavior is fully conforming with the standard.
   --  The standard has nothing whatever to say about source representation
   --  and implementations are completely free to make their own rules. In
   --  this case, in 8-bit mode, GNAT decides that the 16#0085# character is
   --  not a representation of the NEL character, even though it looks like it.
   --  If you have NELs in your program, which you expect to be treated as
   --  end of line characters, you must use a wide character encoding such as
   --  UTF-8 for this code to be recognized.

   --  Second, an explanation of why we make this slightly surprising choice.
   --  We have never encountered anyone actually using the NEL character to
   --  end lines. One user raised the issue as a result of some experiments,
   --  but no one has ever submitted a program encoded this way, in any of
   --  the possible encodings. It seems that even when using wide character
   --  codes extensively, the normal approach is to use standard line enders
   --  (LF or CR/LF). So the failure to recognize NEL in this mode seems to
   --  have no practical downside.

   --  Moreover, what we have seen in a significant number of programs from
   --  multiple sources is the practice of writing all program text in lower
   --  half (ASCII) form, but using UTF-8 encoded wide characters freely in
   --  comments, where the comments are terminated by normal line endings
   --  (LF or CR/LF). The comments do not contain NEL codes, but they can and
   --  do contain other UTF-8 encoding sequences where one of the bytes is the
   --  NEL code. Now such programs can of course be compiled in UTF-8 mode,
   --  but in practice they also compile fine in standard 8-bit mode without
   --  specifying a character encoding. Since this is common practice, it would
   --  be a significant upwards incompatibility to recognize NEL in 8-bit mode.

   procedure Backup_Line (P : in out Source_Ptr);
   --  Back up the argument pointer to the start of the previous line. On
   --  entry, P points to the start of a physical line in the source buffer.
   --  On return, P is updated to point to the start of the previous line.
   --  The caller has checked that a Line_Terminator character precedes P so
   --  that there definitely is a previous line in the source buffer.

   procedure Build_Location_String (Loc : Source_Ptr);
   --  This procedure builds a string literal of the form "name:line", where
   --  name is the file name corresponding to Loc, and line is the line number.
   --  In the event that instantiations are involved, additional suffixes of
   --  the same form are appended after the separating string
   --  " instantiated at ". The resulting string is appended to the
   --  Name_Buffer, terminated by ASCII.NUL, with Name_Length indicating the
   --  length not including the terminating NUL.

   function Build_Location_String (Loc : Source_Ptr) return String;
   --  Functional form returning a string, which does not include a terminating
   --  null character. The contents of Name_Buffer are destroyed.

   procedure Check_For_BOM;
   --  Check if the current source starts with a BOM. Scan_Ptr needs to be at
   --  the start of the current source. If the current source starts with a
   --  recognized BOM, then some flags such as Wide_Character_Encoding_Method
   --  are set accordingly, and the Scan_Ptr on return points past this BOM.
   --  An error message is output and Unrecoverable_Error raised if a non-
   --  recognized BOM is detected. The call has no effect if no BOM is found.

   function Get_Column_Number (P : Source_Ptr) return Column_Number;
   --  The ones-origin column number of the specified Source_Ptr value is
   --  determined and returned. Tab characters if present are assumed to
   --  represent the standard 1,9,17.. spacing pattern.
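
   --  The 1,9,17.. tab rule can be sketched as follows. This is only an
   --  illustration of the arithmetic, assuming a line held in an ordinary
   --  String; the function Column and its parameters are hypothetical and
   --  are not the implementation of Get_Column_Number.

   --     with Ada.Assertions; use Ada.Assertions;
   --
   --     procedure Column_Demo is
   --
   --        function Column (Line : String; Pos : Positive) return Positive is
   --           Col : Positive := 1;
   --        begin
   --           for I in Line'First .. Pos - 1 loop
   --              if Line (I) = ASCII.HT then
   --                 --  Advance to the next tab stop (9, 17, 25 ...)
   --                 Col := ((Col - 1) / 8 + 1) * 8 + 1;
   --              else
   --                 Col := Col + 1;
   --              end if;
   --           end loop;
   --
   --           return Col;
   --        end Column;
   --
   --        Line : constant String := ASCII.HT & "x" & ASCII.HT & "y";
   --
   --     begin
   --        Assert (Column (Line, 1) = 1);
   --        Assert (Column (Line, 2) = 9);
   --        Assert (Column (Line, 3) = 10);
   --        Assert (Column (Line, 4) = 17);
   --     end Column_Demo;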

   function Get_Logical_Line_Number
     (P : Source_Ptr) return Logical_Line_Number;
   --  The line number of the specified source position is obtained by
   --  doing a binary search on the source positions in the lines table
   --  for the unit containing the given source position. The returned
   --  value is the logical line number, already adjusted for the effect
   --  of source reference pragmas. If P refers to the line of a source
   --  reference pragma itself, then No_Line_Number is returned. If no source
   --  reference pragmas have been encountered, the value returned is
   --  the same as the physical line number.

   function Get_Logical_Line_Number_Img
     (P : Source_Ptr) return String;
   --  Same as above function, but returns the line number as a string of
   --  decimal digits, with no leading space. Destroys Name_Buffer.

   function Get_Physical_Line_Number
     (P : Source_Ptr) return Physical_Line_Number;
   --  The line number of the specified source position is obtained by
   --  doing a binary search on the source positions in the lines table
   --  for the unit containing the given source position. The returned
   --  value is the physical line number in the source being compiled.
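
   --  A binary search over line start positions can be sketched as below.
   --  The table Starts, its contents and the function Line_Of are
   --  hypothetical; the real lines table and its element types are not
   --  shown in this spec, so this only illustrates the technique.

   --     with Ada.Assertions; use Ada.Assertions;
   --
   --     procedure Line_Demo is
   --
   --        --  Starts (L) is the position of the first character of line L
   --        Starts : constant array (Positive range 1 .. 4) of Natural :=
   --          (0, 12, 13, 40);
   --
   --        function Line_Of (P : Natural) return Positive is
   --           Lo  : Positive := Starts'First;
   --           Hi  : Positive := Starts'Last;
   --           Mid : Positive;
   --        begin
   --           --  Find the last line whose start position is <= P
   --           while Lo < Hi loop
   --              Mid := (Lo + Hi + 1) / 2;
   --
   --              if Starts (Mid) <= P then
   --                 Lo := Mid;
   --              else
   --                 Hi := Mid - 1;
   --              end if;
   --           end loop;
   --
   --           return Lo;
   --        end Line_Of;
   --
   --     begin
   --        Assert (Line_Of (0) = 1);
   --        Assert (Line_Of (12) = 2);
   --        Assert (Line_Of (39) = 3);
   --        Assert (Line_Of (100) = 4);
   --     end Line_Demo;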

   function Get_Source_File_Index (S : Source_Ptr) return Source_File_Index;
   --  Return file table index of file identified by given source pointer
   --  value. This call must always succeed, since any valid source pointer
   --  value belongs to some previously loaded source file.

   function Instantiation_Depth (S : Source_Ptr) return Nat;
   --  Determine instantiation depth for given Sloc value. A value of
   --  zero means that the given Sloc is not in an instantiation.
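
   --  Since the Instantiation field of an instance entry may itself lie in
   --  another instance, the depth is the length of that chain. The sketch
   --  below models this with a small hypothetical table in which
   --  Enclosing (F) names the entry containing the instantiation that
   --  created entry F (No_Entry if F is an ordinary file). The types and
   --  names are invented for the example and plain indexes replace Sloc
   --  values.

   --     with Ada.Assertions; use Ada.Assertions;
   --
   --     procedure Depth_Demo is
   --        type Entry_Index is range 0 .. 3;
   --        subtype Real_Entry is Entry_Index range 1 .. 3;
   --
   --        No_Entry : constant Entry_Index := 0;
   --
   --        --  Entry 3 is an instance inside entry 2, itself inside entry 1
   --        Enclosing : constant array (Real_Entry) of Entry_Index :=
   --          (1 => No_Entry, 2 => 1, 3 => 2);
   --
   --        function Depth (F : Real_Entry) return Natural is
   --           Cur : Real_Entry := F;
   --           D   : Natural := 0;
   --        begin
   --           while Enclosing (Cur) /= No_Entry loop
   --              D   := D + 1;
   --              Cur := Enclosing (Cur);
   --           end loop;
   --
   --           return D;
   --        end Depth;
   --
   --     begin
   --        Assert (Depth (1) = 0);
   --        Assert (Depth (2) = 1);
   --        Assert (Depth (3) = 2);
   --     end Depth_Demo;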

   function Line_Start (P : Source_Ptr) return Source_Ptr;
   --  Finds the source position of the start of the line containing the
   --  given source location.

     (L : Physical_Line_Number;
      S : Source_File_Index) return Source_Ptr;
   --  Finds the source position of the start of the given line in the
   --  given source file, using a physical line number to identify the line.

   function Num_Source_Lines (S : Source_File_Index) return Nat;
   --  Returns the number of source lines (this is equivalent to reading
   --  the value of Last_Source_Line, but returns Nat rather than a
   --  physical line number).

   procedure Register_Source_Ref_Pragma
     (File_Name          : File_Name_Type;
      Stripped_File_Name : File_Name_Type;
      Line_After_Pragma  : Physical_Line_Number);
   --  Register a source reference pragma, the parameter File_Name is the
   --  file name from the pragma, and Stripped_File_Name is this name with
   --  the directory information stripped. Both these parameters are set
   --  to No_Name if no file name parameter was given in the pragma
   --  (which can only happen for the second and subsequent pragmas).
   --  Mapped_Line is the line number parameter from the pragma, and
   --  Line_After_Pragma is the physical line number of the line that
   --  follows the line containing the Source_Reference pragma.

   function Original_Location (S : Source_Ptr) return Source_Ptr;
   --  Given a source pointer S, returns the corresponding source pointer
   --  value ignoring instantiation copies. For locations that do not
   --  correspond to instantiation copies of templates, the argument is
   --  returned unchanged. For locations that do correspond to copies of
   --  templates from instantiations, the location within the original
   --  template is returned. This is useful in canonicalizing locations.

   function Instantiation_Location (S : Source_Ptr) return Source_Ptr;
   pragma Inline (Instantiation_Location);
   --  Given a source pointer S, returns the corresponding source pointer
   --  value of the instantiation if this location is within an instance.
   --  If S is not within an instance, then this returns No_Location.

   function Top_Level_Location (S : Source_Ptr) return Source_Ptr;
   --  Given a source pointer S, returns the argument unchanged if it is
   --  not in an instantiation. If S is in an instantiation, then it returns
   --  the location of the top level instantiation, i.e. the outer level
   --  instantiation in the nested case.

   function Physical_To_Logical
     (Line : Physical_Line_Number;
      S    : Source_File_Index) return Logical_Line_Number;
   --  Given a physical line number in source file whose source index is S,
   --  return the corresponding logical line number. If the physical line
   --  number is one containing a Source_Reference pragma, the result will
   --  be No_Line_Number.

   procedure Skip_Line_Terminators
     (P        : in out Source_Ptr;
      Physical : out Boolean);
   --  On entry, P points to a line terminator that has been encountered,
   --  which is one of FF, LF, VT, CR or a wide character sequence whose value
   --  is in category Separator, Line or Separator, Paragraph. P points just
   --  past the character that was scanned. The purpose of this routine is to
   --  distinguish physical and logical line endings. A physical line ending
   --  is one of:
   --
   --     CR on its own (MAC System 7)
   --     LF on its own (Unix and unix-like systems)
   --     CR/LF (DOS, Windows)
   --     Wide character in Separator, Line or Separator, Paragraph category
   --
   --     Note: we no longer recognize LF/CR (which we did in some earlier
   --     versions of GNAT). The reason for this is that this sequence is not
   --     used and recognizing it generated confusion. For example given the
   --     sequence LF/CR/LF we were interpreting that as (LF/CR) ending the
   --     first line and a blank line ending with CR following, but it is
   --     clearly better to interpret this as LF, with a blank line terminated
   --     by CR/LF, given that LF and CR/LF are both in common use, but no
   --     system we know of uses LF/CR.
   --
   --  A logical line ending (that is not a physical line ending) is one of:
   --
   --     VT on its own
   --     FF on its own
   --
   --  On return, P is bumped past the line ending sequence (one of the
   --  possibilities listed above). Physical is set to True to indicate that a
   --  physical end of line was encountered, in which case this routine also
   --  makes sure that the lines table for the current source file has an
   --  appropriate entry for the start of the new physical line.

   procedure Sloc_Range (N : Node_Id; Min, Max : out Source_Ptr);
   --  Given a node, returns the minimum and maximum source locations of any
   --  node in the syntactic subtree for the node. This is not quite the same
   --  as the locations of the first and last token in the node construct
   --  because parentheses at the outer level do not have a recorded Sloc.
   --
   --  Note: if the subtree for N contains no "real" Sloc values, i.e.
   --  values > No_Location, then both Min and Max are set to Sloc (N).

   function Source_Offset (S : Source_Ptr) return Nat;
   --  Returns the zero-origin offset of the given source location from the
   --  start of its corresponding unit. This is used for creating canonical
   --  names in some situations.

   procedure Write_Location (P : Source_Ptr);
   --  Writes out a string of the form fff:nn:cc, where fff, nn, cc are the
   --  file name, line number and column corresponding to the given source
   --  location. No_Location and Standard_Location appear as the strings
   --  <no location> and <standard location>. If the location is within an
   --  instantiation, then the instance location is appended, enclosed in
   --  square brackets (which can nest if necessary). Note that this routine
   --  is used only for internal compiler debugging output purposes (which
   --  is why the somewhat cryptic use of brackets is acceptable).
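   --
   --  Illustration (file names and positions are hypothetical, and the
   --  exact spacing around the brackets is an assumption): a location in
   --  an ordinary source might be written as
   --
   --     main.adb:25:4
   --
   --  while a location inside an instance of a generic might be written as
   --
   --     gen.adb:10:7 [main.adb:25:4]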

   procedure wl (P : Source_Ptr);
   pragma Export (Ada, wl);
   --  Equivalent to Write_Location (P); Write_Eol; for calls from GDB

   procedure Write_Time_Stamp (S : Source_File_Index);
   --  Writes time stamp of specified file in YY-MM-DD HH:MM.SS format

   procedure Tree_Read;
   --  Initializes internal tables from current tree file using the relevant
   --  Table.Tree_Read routines.

   procedure Tree_Write;
   --  Writes out internal tables to current tree file using the relevant
   --  Table.Tree_Write routines.

   pragma Inline (File_Name);
   pragma Inline (First_Mapped_Line);
   pragma Inline (Full_File_Name);
   pragma Inline (Identifier_Casing);
   pragma Inline (Instantiation);
   pragma Inline (Keyword_Casing);
   pragma Inline (Last_Source_Line);
   pragma Inline (Last_Source_File);
   pragma Inline (License);
   pragma Inline (Num_SRef_Pragmas);
   pragma Inline (Num_Source_Files);
   pragma Inline (Num_Source_Lines);
   pragma Inline (Reference_Name);
   pragma Inline (Set_Keyword_Casing);
   pragma Inline (Set_Identifier_Casing);
   pragma Inline (Source_First);
   pragma Inline (Source_Last);
   pragma Inline (Source_Text);
   pragma Inline (Template);
   pragma Inline (Time_Stamp);

   -------------------------
   -- Source_Lines Tables --
   -------------------------

   type Lines_Table_Type is
     array (Physical_Line_Number) of Source_Ptr;
   --  Type used for lines table. The entries are indexed by physical line
   --  numbers. The values are the starting Source_Ptr values for the start
   --  of the corresponding physical line. Note that we make this a bogus
   --  big array, sized as required, so that we avoid the use of fat pointers.

   type Lines_Table_Ptr is access all Lines_Table_Type;
   --  Type used for pointers to line tables

   type Logical_Lines_Table_Type is
     array (Physical_Line_Number) of Logical_Line_Number;
   --  Type used for logical lines table. This table is used if a source
   --  reference pragma is present. It is indexed by physical line numbers,
   --  and contains the corresponding logical line numbers. An entry that
   --  corresponds to a source reference pragma is set to No_Line_Number.
   --  Note that we make this a bogus big array, sized as required, so that
   --  we avoid the use of fat pointers.

   type Logical_Lines_Table_Ptr is access all Logical_Lines_Table_Type;
   --  Type used for pointers to logical line tables
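   --
   --  A minimal sketch of the idiom used by both table types (the names
   --  below are hypothetical and are not part of this package):
   --
   --     type Vec is array (Positive) of Integer;
   --     type Vec_Ptr is access all Vec;
   --     --  Vec is constrained, so a Vec_Ptr needs no bounds and can be a
   --     --  single address. Only a prefix of the nominal index range is
   --     --  ever allocated, so the allocated extent is tracked separately.
   --
   --     type Arr is array (Positive range <>) of Integer;
   --     type Arr_Ptr is access all Arr;
   --     --  Arr is unconstrained, so an Arr_Ptr must also carry the bounds,
   --     --  which GNAT by default implements as a fat pointer.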

   -----------------------
   -- Source_File Table --
   -----------------------

   --  See earlier descriptions for meanings of public fields

   type Source_File_Record is record
      File_Name         : File_Name_Type;
      Reference_Name    : File_Name_Type;
      Debug_Source_Name : File_Name_Type;
      Full_Debug_Name   : File_Name_Type;
      Full_File_Name    : File_Name_Type;
      Full_Ref_Name     : File_Name_Type;
      Num_SRef_Pragmas  : Nat;
      First_Mapped_Line : Logical_Line_Number;
      Source_Text       : Source_Buffer_Ptr;
      Source_First      : Source_Ptr;
      Source_Last       : Source_Ptr;
      Source_Checksum   : Word;
      Last_Source_Line  : Physical_Line_Number;
      Instantiation     : Source_Ptr;
      Template          : Source_File_Index;
      Unit              : Unit_Number_Type;
      Time_Stamp        : Time_Stamp_Type;
      File_Type         : Type_Of_File;
      Inlined_Body      : Boolean;
      License           : License_Type;
      Keyword_Casing    : Casing_Type;
      Identifier_Casing : Casing_Type;

      --  The following fields are for internal use only (i.e. only in the
      --  body of Sinput or its children, with no direct access by clients).

      Sloc_Adjust : Source_Ptr;
      --  A value to be added to Sloc values for this file to reference the
      --  corresponding lines table. This is zero for the non-instantiation
      --  case, and set so that the addition references the ultimate template
      --  for the instantiation case. See Sinput-L for further details.

      Lines_Table : Lines_Table_Ptr;
      --  Pointer to lines table for this source. Updated as additional
      --  lines are accessed using the Skip_Line_Terminators procedure.
      --  Note: the lines table for an instantiation entry refers to the
      --  original line numbers of the template; see Sinput-L for details.

      Logical_Lines_Table : Logical_Lines_Table_Ptr;
      --  Pointer to logical lines table for this source. Non-null only if
      --  a source reference pragma has been processed. Updated as lines
      --  are accessed using the Skip_Line_Terminators procedure.

      Lines_Table_Max : Physical_Line_Number;
      --  Maximum subscript value for the currently allocated Lines_Table
      --  and (if present) the allocated Logical_Lines_Table. The value
      --  Max_Source_Line gives the maximum used value; this gives the
      --  maximum allocated value.
   end record;

   --  The following representation clause ensures that the above record
   --  has no holes. We do this so that when instances of this record are
   --  written by Tree_Gen, we do not write uninitialized values to the file.

   AS : constant Pos := Standard'Address_Size;

   for Source_File_Record use record
      File_Name           at  0 range 0 .. 31;
      Reference_Name      at  4 range 0 .. 31;
      Debug_Source_Name   at  8 range 0 .. 31;
      Full_Debug_Name     at 12 range 0 .. 31;
      Full_File_Name      at 16 range 0 .. 31;
      Full_Ref_Name       at 20 range 0 .. 31;
      Num_SRef_Pragmas    at 24 range 0 .. 31;
      First_Mapped_Line   at 28 range 0 .. 31;
      Source_First        at 32 range 0 .. 31;
      Source_Last         at 36 range 0 .. 31;
      Source_Checksum     at 40 range 0 .. 31;
      Last_Source_Line    at 44 range 0 .. 31;
      Instantiation       at 48 range 0 .. 31;
      Template            at 52 range 0 .. 31;
      Unit                at 56 range 0 .. 31;
      Time_Stamp          at 60 range 0 .. 8 * Time_Stamp_Length - 1;
      File_Type           at 74 range 0 .. 7;
      Inlined_Body        at 75 range 0 .. 7;
      License             at 76 range 0 .. 7;
      Keyword_Casing      at 77 range 0 .. 7;
      Identifier_Casing   at 78 range 0 .. 15;
      Sloc_Adjust         at 80 range 0 .. 31;
      Lines_Table_Max     at 84 range 0 .. 31;

      --  The following fields are pointers, so we have to specialize their
      --  lengths using pointer size, obtained above as Standard'Address_Size.

      Source_Text         at 88 range 0      .. AS - 1;
      Lines_Table         at 88 range AS     .. AS * 2 - 1;
      Logical_Lines_Table at 88 range AS * 2 .. AS * 3 - 1;
   end record;

   for Source_File_Record'Size use 88 * 8 + AS * 3;
   --  This ensures that we did not leave out any fields
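   --
   --  Worked example of the size check (plain arithmetic on the values
   --  above): the fixed part occupies bytes 0 .. 87, i.e. 88 * 8 = 704 bits,
   --  followed by three pointers of AS bits each.
   --
   --     AS = 32:  704 + 3 * 32 = 800 bits (100 bytes)
   --     AS = 64:  704 + 3 * 64 = 896 bits (112 bytes)
   --
   --  Similarly, Time_Stamp starts at byte 60 and File_Type at byte 74, so
   --  the layout has no hole there only if Time_Stamp_Length is 14.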

   package Source_File is new Table.Table (
     Table_Component_Type => Source_File_Record,
     Table_Index_Type     => Source_File_Index,
     Table_Low_Bound      => 1,
     Table_Initial        => Alloc.Source_File_Initial,
     Table_Increment      => Alloc.Source_File_Increment,
     Table_Name           => "Source_File");

   procedure Alloc_Line_Tables
     (S       : in out Source_File_Record;
      New_Max : Nat);
   --  Allocate or reallocate the lines table for the given source file so
   --  that it can accommodate at least New_Max lines. Also allocates or
   --  reallocates logical lines table if source ref pragmas are present.

   procedure Add_Line_Tables_Entry
     (S : in out Source_File_Record;
      P : Source_Ptr);
   --  Increment line table size by one (reallocating the lines table if
   --  needed) and set the new entry to contain the value P. Also bumps
   --  the Source_Line_Count field. If source reference pragmas are
   --  present, also increments logical lines table size by one, and
   --  sets the new entry.

   procedure Trim_Lines_Table (S : Source_File_Index);
   --  Set lines table size for entry S in the source file table to
   --  correspond to the current value of Num_Source_Lines, releasing
   --  any unused storage. This is used by Sinput.L and Sinput.D.