1 <!doctype linuxdoctr system>
3 <!-- $Id: Lex-YACC-HOWTO.sgml,v 1.2 2002/07/22 14:02:09 gferg Exp $
8 <!-- Title information -->
9 <title>Lex and YACC primer/HOWTO
11 <author>PowerDNS BV (bert hubert <bert@powerdns.com>)&nl;
12 <date>v0.8 $Date: 2002/07/22 14:02:09 $
15 <trans>ÂçÀ¾ Âç¼ù (daiki onishi <onishi@mbc.nifty.com>)&nl;
16 <tdate>v0.8j 2003/02/08
21 This document tries to help you get started using Lex and YACC
24 Ëܥɥ¥å¥á¥ó¥È¤Ï Lex ¤È YACC ¤Î´ðËÜŪ¤Ê»È¤¤Êý¤Ë¤Ä¤¤¤Æµ½Ò¤·¤Þ¤¹
28 <!-- Table of contents -->
31 <!-- Begin the document -->
44 Welcome, gentle reader.
47 ¤è¤¦¤³¤½¡¢¹âµ®¤Ê¤ëÆɼԤΤߤʤµ¤ó
52 If you have been programming for any length of time in a Unix environment,
53 you will have encountered the mystical programs Lex & YACC, or as they
54 are known to GNU/Linux users worldwide, Flex & Bison, where Flex is a
55 Lex implementation by Vern Paxon and Bison the GNU version of YACC. We will
56 call these programs Lex and YACC throughout - the newer versions are
57 upwardly compatible, so you can use Flex and Bison when trying our examples.
60 Unix ´Ä¶¤Ç¡¢¤¢¤ëÄøÅÙ¥×¥í¥°¥é¥ß¥ó¥°·Ð¸³¤òÀѤޤì¤Æ¤¤¤ëÊý¤Ê¤é¡¢Lex &
61 YACC¡¢¤â¤·¤¯¤Ï GNU/Linux ¥æ¡¼¥¶¤Î´Ö¤Ç Flex & Bison ¤È¤·¤ÆÃΤé¤ì¤Æ
62 ¤¤¤ë¡¢¿ÀÈëŪ¤Ê¥×¥í¥°¥é¥à¤ò¤´Â¸ÃΤΤ³¤È¤È»×¤¤¤Þ¤¹¡£Flex ¤È¤Ï¡¢Vern
63 Paxon ¤Ë¤è¤ë Lex ¼ÂÁõ¤Ç¤¢¤ê¡¢Bison ¤È¤Ï GNU ÈÇ YACC ¤Ç¤¹¡£°Ê²¼¡¢ÃǤê¤Î
64 ¤Ê¤¤¸Â¤ê¤³¤ì¤é¤ò Lex & YACC ¤È¸Æ¤Ö¤³¤È¤Ë¤·¤Þ¤¹ - Flex & Bison
65 ¤Ï¡¢Lex & YACC ¤È¾å°Ì¸ß´¹¤Ë¤¢¤ë¤Î¤Ç¡¢Ëܥɥ¥å¥á¥ó¥È¤Î¥µ¥ó¥×¥ë¤â¤½
69 These programs are massively useful, but as with your C compiler, their
70 manpage does not explain the language they understand, nor how to use them.
71 YACC is really amazing when used in combination with Lex, however, the Bison
72 manpage does not describe how to integrate Lex generated code with your
76 ¤³¤ì¤é¤Î¥×¥í¥°¥é¥à¤Ï¡¢Èó¾ï¤ËÍøÍѲÁÃͤι⤤¤â¤Î¤Ç¤¹¡£¤·¤«¤·¡¢C ¥³¥ó¥Ñ¥¤
77 ¥é¤Î man ¥Ú¡¼¥¸¤¬¤½¤¦¤Ç¤¢¤ë¤è¤¦¤Ë¡¢¸À¸ì»ÅÍͤϤâ¤È¤è¤ê»È¤¤Êý¤Ë¤Ä¤¤¤Æ¤¹
78 ¤éËþ¤ʵ½Ò¤¬¤¢¤ê¤Þ¤»¤ó¡£YACC ¤Ï Lex ¤ÈÁȤ߹ç¤ï¤»¤Æ»È¤¦¤È¡¢¹â¤¤¸ú²Ì¤¬
79 ÆÀ¤é¤ì¤ë¤Î¤Ç¤¹¤¬¡¢Bison ¤Î man ¥Ú¡¼¥¸¤Ë¤Ï Lex ¤ÇÀ¸À®¤µ¤ì¤¿¥³¡¼¥É¤È
80 Bison ¤Î¥×¥í¥°¥é¥à¤ò¶¨Ä´Æ°ºî¤µ¤»¤ëÊýË¡¤Ë¤Ä¤¤¤Æ¤Îµ½Ò¤¬¤¢¤ê¤Þ¤»¤ó¡£
84 What this document is NOT
86 Ëܥɥ¥å¥á¥ó¥È¤Ë´Þ¤Þ¤ì¤Ê¤¤¤â¤Î
90 There are several great books which deal with Lex & YACC. By all means
91 read these books if you need to know more. They provide far more information
92 than we ever will. See the 'Further Reading' section at the end. This
93 document is aimed at bootstrapping your use of Lex
94 & YACC, to allow you to create your first programs.
97 Lex & YACC ¤Ë¤Ä¤¤¤Æ¤Ï¡¢¤¤¤¯¤Ä¤«Îɽñ¤¬¤¢¤ê¤Þ¤¹¡£¿¼¤¯ÃΤꤿ¤¤¤Î¤Ç¤¢
98 ¤ì¤Ð¡¢À§ÈóÆɤळ¤È¤ò¤ª´«¤á¤·¤Þ¤¹¡£Ëܥɥ¥å¥á¥ó¥È¤Ë½ñ¤«¤ì¤Æ¤¤¤ë¤è¤ê¡¢¤º¤Ã
99 ¤È¤¿¤¯¤µ¤ó¤Î¾ðÊó¤¬ÆÀ¤é¤ì¤ë¤Ï¤º¤Ç¤¹¡£´¬Ëö¤Î '´ØÏ¢¾ðÊó' ¤Î¾Ï¤ò¤´Í÷¤¯¤À¤µ
100 ¤¤¡£¤³¤³¤Ç¤Ï¡¢ÆɼԤ¬´Êñ¤Ê¥×¥í¥°¥é¥à¤òÁȤá¤ëÄøÅ٤ˡ¢Lex & YACC ¤Î
101 ´ðËÜŪ¤Ê»È¤¤Êý¤ò²òÀ⤹¤ë¤³¤È¤ËÆâÍƤò¤È¤É¤á¤Þ¤¹¡£
104 The documentation that comes with Flex and BISON is also excellent, but no
105 tutorial. They do complement my HOWTO very well though. They too are
106 referenced at the end.
109 Flex ¤È BISON ¤ËÉÕ°¤·¤Æ¤¯¤ë¥É¥¥å¥á¥ó¥È¤âÎɤ¤¤Î¤Ç¤¹¤¬¡¢¥Á¥å¡¼¥È¥ê¥¢¥ë
110 ¤¬¤¢¤ê¤Þ¤»¤ó¡£¤³¤Î HOWTO ¤Î¤ê¤Ê¤¤Éôʬ¤òÊä´°¤¹¤ëʬ¤Ë¤ÏÍÍѤǤ¹¤¬¡£¤³
111 ¤ì¤Ë´Ø¤·¤Æ¤â¡¢´¬Ëö¤Î´ØÏ¢¾ðÊó¤Î¾Ï¤ò¤´Í÷¤¯¤À¤µ¤¤¡£
114 I am by no means a YACC/Lex expert. When I started writing this document, I
115 had exactly two days of experience. All I want to accomplish is to make
116 those two days easier for you.
119 É®¼Ô¤Ï YACC/Lex ¤Î¥¨¥¥¹¥Ñ¡¼¥È¤Ç¤Ï¤¢¤ê¤Þ¤»¤ó¡£¤³¤Î¥É¥¥å¥á¥ó¥È¤ò½ñ¤»Ï
120 ¤á¤¿º¢¤Ç¤â¡¢¤Á¤ç¤¦¤ÉÆóÆü¤Î·Ð¸³¤·¤«¤¢¤ê¤Þ¤»¤ó¤Ç¤·¤¿¡£É®¼Ô¤Î´ê¤¤¤Ï¡¢¤³¤Î
121 ÆóÆü´Ö¤ò³§¤µ¤ó¤Ë¤È¤Ã¤Æ¡¢¾¯¤·¤Ç¤â³Ú¤Ê¤â¤Î¤Ë¤·¤Æ¤¢¤²¤¿¤¤¤È¤¤¤¦¤³¤È¤Ë¿Ô¤
125 In no way expect the HOWTO to show proper YACC and Lex style. Examples
126 have been kept very simple and there may be better ways to write them. If
127 you know how to, please let me know.
130 ¤³¤³¤ËÎ㼨¤µ¤ì¤ë¡¢YACC ¤È Lex ¤Î¥³¡¼¥Ç¥£¥ó¥°¥¹¥¿¥¤¥ë¤¬¾ï¤ËºÇ¤âŬÀڤǤ¢
131 ¤ë¤È¤Ï¸Â¤ê¤Þ¤»¤ó¡£¥³¡¼¥ÉÎã¤Ï¥·¥ó¥×¥ë¤Ë¤¹¤ë¤è¤¦Åؤá¤Þ¤·¤¿¤¬¡¢¤è¤êÎɤ¤½ñ
132 ¤Êý¤¬¤¢¤ë¤«¤â¤·¤ì¤Þ¤»¤ó¡£¤ªµ¤¤Å¤¤ÎÅÀ¤¬¤¢¤ì¤ÐÀ§È󡢤ª¶µ¤¨¤¯¤À¤µ¤¤¡£
142 Please note that you can download all the examples shown, which are in
143 machine readable form. See the <url name="homepage"
144 url="http://ds9a.nl/lex-yacc"> for details.
147 Î㼨¤µ¤ì¤Æ¤¤¤ë¥³¡¼¥É¤Ï¡¢¤¹¤Ù¤Æ machine readable ¤Ê·Á¼°¤Ç¥À¥¦¥ó¥í¡¼¥É¤Ç
148 ¤¤Þ¤¹¡£¾ÜºÙ¤Ï <url name="¥Û¡¼¥à¥Ú¡¼¥¸" url="http://ds9a.nl/lex-yacc">
160 Copyright (c) 2001 by bert hubert. This material may be
161 distributed only subject to the terms and conditions set forth in the Open
162 Publication License, vX.Y or later (the latest version is presently
163 available at http://www.opencontent.org/openpub/).
166 Copyright (c) 2001 by bert hubert. ¤³¤ÎÃøºîʪ¤ÎÇÛÉۤ˴ؤ·¤Æ¤Ï Open
167 Publication License, vX.Y ¤Þ¤¿¤Ï¤½¤ì°Ê¹ß¤ÇÄê¤á¤é¤ì¤Æ¤¤¤ëµ¬Ìó¤È¾ò·ï¤Ë½à
168 µò¤·¤Þ¤¹¡ÊºÇ¿·ÈÇ¤Ï http://www.opencontent.org/openpub/ ¤ÇÆþ¼ê²Äǽ¤Ç¤¹¡Ë¡£
172 What Lex & YACC can do for you
175 Lex ¤È YACC ¤Ç¤Ç¤¤ë¤³¤È
179 When properly used, these programs allow you to parse complex languages with
180 ease. This is a great boon when you want to read a configuration file, or
181 want to write a compiler for any language you (or anyone else) might have
185 ¤³¤ì¤é¤Î¥×¥í¥°¥é¥à¤òÀµ¤·¤¯ÍøÍѤ¹¤ë¤È¡¢´Êñ¤ËÊ£»¨¤Ê¸À¸ì¤Î¹½Ê¸²òÀϤ¬¤Ç¤
186 ¤ë¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£ÆäËÀßÄê¥Õ¥¡¥¤¥ë¤òÆɤ߹þ¤ß¤¿¤¤»þ¤ä¡¢¼«Ê¬¤Þ¤¿¤Ï¾¿Í¤¬
187 ȯ°Æ¤·¤¿¸À¸ìÍѤΥ³¥ó¥Ñ¥¤¥é¤ò½ñ¤¤¿¤¤»þ¤Ê¤É¤Ë¡¢Èó¾ï¤Ë½õ¤±¤È¤Ê¤ê¤Þ¤¹¡£
190 With a little help, which this document will hopefully provide, you will
191 find that you will never write a parser again by hand - Lex & YACC are
192 the tools to do this.
195 ¤³¤Î¥É¥¥å¥á¥ó¥È¤Ç¤Ï¡¢¤ï¤º¤«¤Ê¼ê½õ¤±¤Ë¤·¤«¤Ê¤é¤Ê¤¤¤«¤â¤·¤ì¤Þ¤»¤ó¤¬¡¢¤½
196 ¤ì¤Ç¤â¡¢º£¸å¼êºî¶È¤Ç¹½Ê¸²òÀÏ´ï (Parser) ¤ò½ñ¤¤¤Æ¤ß¤è¤¦¤È¤Ï»×¤ï¤Ê¤¯¤Ê¤ë
197 ¤Ï¤º¤Ç¤¹ - Lex & YACC¤È¤Ï¤½¤Î¤è¤¦¤Êºî¶È¤ò¤·¤Æ¤¯¤ì¤ë¥Ä¡¼¥ë¤Ç¤¹¡£
201 What each program does on its own
204 ¤½¤ì¤¾¤ì¤Î¥×¥í¥°¥é¥à¤Î¤ä¤Ã¤Æ¤¤¤ë¤³¤È
208 Although these programs shine when used together, they each serve a
209 different purpose. The next chapter will explain what each part does.
212 ¤³¤ì¤é¤Î¥×¥í¥°¥é¥à¤Ï¡¢ÁȤ߹ç¤ï¤»¤Æ»È¤¦¤È¤¹¤Ð¤é¤·¤¤¤â¤Î¤Ç¤¹¤¬¡¢¤½¤ì¤¾¤ì
213 ¤Ï°ã¤Ã¤¿ÌÜŪ¤Î¾å¤Ëºî¤é¤ì¤Æ¤¤¤Þ¤¹¡£¼¡¾Ï¤Ç¤Ï¤½¤ì¤¾¤ì¤¬¤ä¤Ã¤Æ¤¤¤ë¤³¤È¤òÀâ
219 The program Lex generates a so called `Lexer'. This is a function that takes
220 a stream of characters as its input, and whenever it sees a group of
221 characters that match a key, takes a certain action. A very simple example:
224 Lex ¥×¥í¥°¥é¥à¤Ï '»ú¶ç²òÀÏ´ï (Lexer)' ¤È¸Æ¤Ð¤ì¤ë¤â¤Î¤òÀ¸À®¤·¤Þ¤¹¡£¤³¤ì
225 ¤ÏÆþÎϤËʸ»úÎ󥹥ȥ꡼¥à¤ò¤È¤ë´Ø¿ô¤Ç¡¢¥¡¼¤Ë¥Þ¥Ã¥Á¤¹¤ëʸ»úÎ󷲤ò¸«¤Ä¤±
226 ¤¿»þ¤Ë¡¢¤¢¤ë·è¤Þ¤Ã¤¿Æ°ºî¤ò¤µ¤»¤ë¤³¤È¤¬¤Ç¤¤Þ¤¹¡£°Ê²¼¤Ï¤½¤Î´Êñ¤ÊÎã¤Ç¤¹¡£
234 stop printf(&dquot;Stop command received\n&dquot;);
235 start printf(&dquot;Start command received\n&dquot;);
241 The first section, in between the %{ and %} pair is included directly in the
242 output program. We need this, because we use printf later on, which is
246 %{ ¤È %} ¤ÎÁȤdzç¤é¤ì¤ëºÇ½é¤Î¥»¥¯¥·¥ç¥ó¤Ï¡¢½ÐÎÏ¥×¥í¥°¥é¥à¤Ç¤ÏľÀÜ¥¤¥ó
247 ¥¯¥ë¡¼¥É¤µ¤ì¤Þ¤¹¡£¤³¤ì¤Ï¡¢stdio.h ¤ÇÄêµÁ¤µ¤ì¤Æ¤¤¤ë printf ¤¬¡¢¸å¤ÇɬÍ×
251 Sections are separated using '%%', so the first line of the second section
252 starts with the 'stop' key. Whenever the 'stop' key is encountered in the
253 input, the rest of the line (a printf() call) is executed.
256 ¥»¥¯¥·¥ç¥ó¤Ï '%%' ¤Ç¶èÀÚ¤é¤ì¡¢Æó¤ÄÌܤΥ»¥¯¥·¥ç¥ó¤ÎÂè°ì¹Ô¤Ï 'stop' ¥¡¼
257 ¤Ç»Ï¤Þ¤ë¤³¤È¤Ë¤Ê¤ê¤Þ¤¹¡£ÆþÎÏ¤Ç 'stop' ¥¡¼¤¬È¯¸½¤·¤¿»þ¤Ï¡¢»Ä¤ê¹Ô (
258 printf() ¸Æ¤Ó½Ð¤·) ¤¬¼Â¹Ô¤µ¤ì¤Þ¤¹¡£
261 Besides 'stop', we've also defined 'start', which otherwise does mostly the
265 &dquot;stop&dquot; ¤Ë²Ã¤¨¤Æ¡¢¤³¤³¤Ç¤Ï &dquot;start&dquot; ¤È¤¤¤¦¤Û¤È¤ó¤ÉƱ¤¸Æ°ºî¤ò¤¹¤ë¤â¤Î¤â
269 We terminate the code section with '%%' again.
272 ¾åµ¤Î¥³¡¼¥É¥»¥¯¥·¥ç¥ó¤ò '%%' ¤ÇÊĤ¸¤Þ¤¹¡£
275 To compile Example 1, do this:
278 Example 1 ¤ò¥³¥ó¥Ñ¥¤¥ë¤¹¤ë¤Ë¤Ï°Ê²¼¤Î¤è¤¦¤Ë¤·¤Þ¤¹¡£
282 cc lex.yy.c -o example1 -ll
288 NOTE: If you are using flex, instead of lex, you may have to change '-ll'
289 to '-lfl' in the compilation scripts. RedHat 6.x and SuSE need this, even when
290 you invoke 'flex' as 'lex'!
295 Ãí°Õ - lex ¤ÎÂå¤ï¤ê¤Ë flex ¤ò»ÈÍѤ·¤Æ¤¤¤ëÊý¤Ï¡¢¥³¥ó¥Ñ¥¤¥ë¥¹¥¯¥ê¥×¥È¤Î'-
296 ll' ¤ò '-lfl' ¤ËÃÖ¤´¹¤¨¤ëɬÍפ¬¤¢¤ë¤«¤â¤·¤ì¤Þ¤»¤ó¡£RedHat 6.x ¤äSuSE
297 ¤Ç¤Ï 'flex' ¤ò 'lex' ¤È¤·¤Æµ¯Æ°¤·¤Æ¤¤¤ë¤«¤â¤·¤ì¤Þ¤»¤ó¤¬¡¢¤³¤ÎÊѹ¹¤¬É¬
302 This will generate the file 'example1'. If you run it, it waits for you to
303 type some input. Whenever you type something that is not matched by any of
304 the defined keys (ie, 'stop' and 'start') it's output again. If you
305 enter 'stop' it will output 'Stop command received';
308 °Ê¾å¤Ë¤è¤ê¡¢'example1' ¤È¤¤¤¦¥Õ¥¡¥¤¥ë¤¬À¸À®¤µ¤ì¤¿¤È»×¤¤¤Þ¤¹¡£¼Â¹Ô¤¹¤ë
309 ¤È¡¢¥¡¼¥Ü¡¼¥É¤«¤é¤ÎÆþÎÏÂÔ¤Á¤Ë¤Ê¤ê¤Þ¤¹¡£ÄêµÁºÑ¤ß¤Î¥¡¼ ( ¨¤Á¡¢'stop'
310 ¤ä 'start') °Ê³°¤Î¤â¤Î¤òÆþÎϤ¹¤ë¤È¡¢¤½¤ì¤¬¤½¤Î¤Þ¤Þ½ÐÎϤµ¤ì¤Þ¤¹¡£'stop'
311 ¤òÆþÎϤ¹¤ë¤È¡¢'Stop command received' ¤¬½ÐÎϤµ¤ì¤Þ¤¹¡£
314 Terminate with a EOF (^D).
317 EOF (^D) ¤Ç¥×¥í¥°¥é¥à¤ò½ªÎ»¤µ¤»¤ë¤³¤È¤¬¤Ç¤¤Þ¤¹¡£
320 You may wonder how the program runs, as we didn't define a main() function.
321 This function is defined for you in libl (liblex) which we compiled in with
325 main() ´Ø¿ô¤âÄêµÁ¤µ¤ì¤Æ¤¤¤Ê¤¤¤Î¤Ë¡¢¤É¤¦¤ä¤Ã¤Æ¥×¥í¥°¥é¥à¤¬Æ°¤¤¤¿¤Î¤«ÉÔ
326 »×µÄ¤Ë»×¤ï¤ì¤¿¤«¤â¤·¤ì¤Þ¤»¤ó¡£¤³¤ì¤Ï¡¢-ll ¥³¥Þ¥ó¥É¤Ç¥³¥ó¥Ñ¥¤¥ë»þ¤Ë¥ê¥ó
327 ¥¯¤·¤¿ libl (liblex) ¤¬¡¢main() ´Ø¿ô¤ÎÄêµÁ¤ò´Þ¤ó¤Ç¤¤¤¿¤«¤é¤Ç¤¹¡£
331 Regular expressions in matches
338 This example wasn't very useful in itself, and our next one won't be either.
339 It will however show how to use regular expressions in Lex, which are
340 massively useful later on.
343 ¾åµ¤ÎÎã¤Ï¡¢¤½¤ì¼«¿È¤Ç¤Ï¤¢¤Þ¤ê»È¤¨¤ë¤â¤Î¤Ç¤Ï¤¢¤ê¤Þ¤»¤ó¤Ç¤·¤¿¡£¼¡¤ÎÎã¤â
344 ¤½¤ì¤Û¤ÉÍøÍѲÁÃͤΤ¢¤ë¤â¤Î¤Ç¤Ï¤Ê¤¤¤Î¤Ç¤¹¤¬¡¢¸å¡¹½ÅÊõ¤¹¤ë¤³¤È¤Ë¤Ê¤ë¡¢
345 Lex ¤Ç¤ÎÀµµ¬É½¸½¤Î»È¤¤Êý¤òÎ㼨¤·¤Æ¤¤¤Þ¤¹¡£
354 [0123456789]+ printf(&dquot;NUMBER\n&dquot;);
355 [a-zA-Z][a-zA-Z0-9]* printf(&dquot;WORD\n&dquot;);
361 This Lex file describes two kinds of matches (tokens): WORDs and NUMBERs.
362 Regular expressions can be pretty daunting but with only a little work it is
363 easy to understand them. Let's examine the NUMBER match:
366 ¤³¤Î Lex ¥Õ¥¡¥¤¥ë¤Ç¤Ï WORD ¤È NUMBER ¤È¤¤¤¦¡¢Æó¼ïÎà¤Î¥Þ¥Ã¥Á¡Ê¥È¡¼¥¯¥ó¡Ë
367 ¤òµ½Ò¤·¤Æ¤¤¤Þ¤¹¡£Àµµ¬É½¸½¤Èʹ¤¯¤È¤Ó¤¯¤Ä¤¤¤Æ¤·¤Þ¤¦¿Í¤â¤¤¤ë¤«¤â¤·¤ì¤Þ¤»
368 ¤ó¤¬¡¢¤Á¤ç¤Ã¤ÈÊÙ¶¯¤¹¤ì¤Ð¤¹¤°¤ËÍý²ò¤Ç¤¤ë¤è¤¦¤Ë¤Ê¤ë¤â¤Î¤Ç¤¹¡£NUMBER ¤Ë
369 ÂФ¹¤ë¥Þ¥Ã¥Á¤ò¸«¤Æ¤ß¤Þ¤·¤ç¤¦¡£
374 This says: a sequence of one or more characters from the group 0123456789.
375 We could also have written it shorter as:
378 ¤³¤ì¤Ï¡¢0123456789 ¤Î¤É¤ì¤«°ìʸ»ú¤ò´Þ¤àʸ»ú¡¢¤Þ¤¿¤Ïʸ»úÎó¤¬Â¸ºß¤¹¤ë¤È
379 ¤¤¤¦°ÕÌ£¤Ç¤¹¡£°Ê²¼¤Î¤è¤¦¤Ê´Êάɽµ¤â¤Ç¤¤Þ¤¹¡£
384 Now, the WORD match is somewhat more involved:
387 WORD ¥Þ¥Ã¥Á¤Ï¤â¤¦¾¯¤·Ê£»¨¤Ë¤Ê¤ê¤Þ¤¹¡£
392 The first part matches 1 and only 1 character that is between 'a' and 'z',
393 or between 'A' and 'Z'. In other words, a letter. This initial letter then
394 needs to be followed by zero or more characters which are either a letter or
395 a digit. Why use an asterisk here? The '+' signifies 1 or more matches, but
396 a WORD might very well consist of only one character, which we've already
397 matched. So the second part may have zero matches, so we write a '*'.
400 Á°È¾Éôʬ¤Ï¡¢'a' ¤«¤é 'z' ¤Þ¤¿¤Ï 'A' ¤«¤é 'Z' ¤Î´Ö¤Îʸ»úÎ󡢤Ĥޤꥢ¥ë
401 ¥Õ¥¡¥Ù¥Ã¥È¤Î¤É¤ì¤«¤È¤¤¤¦°ÕÌ£¤Ç¤¹¡£¥¢¥ë¥Õ¥¡¥Ù¥Ã¥È¤Î¸å¤Ë¤Ï¡¢¥¢¥ë¥Õ¥¡¥Ù¥Ã
402 ¥È¤â¤·¤¯¤Ï¥¢¥é¥Ó¥¢¿ô»ú¤¬¥¼¥í¸Ä°Ê¾å³¤¤Þ¤¹¡£¥¢¥¹¥¿¥ê¥¹¥¯¤ò»È¤Ã¤Æ¤¤¤ë¤Î
403 ¤Ï²¿¸Î¤Ç¤·¤ç¤¦¡© '+' ¤È¤¤¤¦¤Î¤Ï°ì¸Ä°Ê¾å¤Î¥Þ¥Ã¥Á¤òɽ¤·¤Þ¤¹¤¬¡¢WORD ¤Ï¡¢
404 Á°È¾Éôʬ¤Ç´û¤Ë¥Þ¥Ã¥Á¤·¤¿°ìʸ»ú¤Î¤ß¤È¤¤¤¦²ÄǽÀ¤â¤¢¤ê¤Þ¤¹¡£¤½¤Î¾ì¹ç¤Ë¤Ï¡¢
405 ¸åȾÉôʬ¤Ç¤Î¥Þ¥Ã¥Á¤¬¥¼¥í¤Ë¤Ê¤Ã¤Æ¤·¤Þ¤¦¤Î¤Ç¡¢'*' ¤È¤¹¤ëɬÍפ¬¤¢¤ë¤Î¤Ç¤¹¡£
408 This way, we've mimicked the behaviour of many programming languages which
409 demand that a variable name *must* start with a letter, but can contain
410 digits afterwards. In other words, 'temperature1' is a valid name,
411 but '1temperature' is not.
414 ¤³¤Î¤è¤¦¤Ë¤·¤Æ¡¢Â¿¤¯¤Î¥×¥í¥°¥é¥ß¥ó¥°¸À¸ì¤¬Í׵᤹¤ë¤è¤¦¤Ê¡¢ºÇ½é¤Îʸ»ú¤¬
415 ¥¢¥ë¥Õ¥¡¥Ù¥Ã¥È¤Ç »Ï¤Þ¤é¤Ê¤¯¤Æ¤Ï *¤Ê¤é¤º*¡¢¤½¤Î¸å¤Ï¥¢¥é¥Ó¥¢¿ô»ú¤ò´Þ¤ó¤Ç
416 ¤âÎɤ¤¤È¤¤¤¦¤è¤¦¤Ê¡¢ÊÑ¿ô̾¤Îµ¬Â§¤Ë»÷¤»¤¿¤â¤Î¤òºî¤ë¤³¤È¤¬¤Ç¤¤Þ¤·¤¿¡£¤Ä
417 ¤Þ¤ê¡¢'temperature1' ¤ÏÎɤ¤¤Ç¤¹¤¬¡¢'1temperature' ¤Ï¤À¤á¤È¤¤¤¦¤³¤È¤Ë¤Ê
421 Try compiling Example 2, lust like Example 1, and feed it some text. Here is
425 Example 1 ¤Ç¤ä¤Ã¤¿¤è¤¦¤Ë¡¢Example 2 ¤ò¥³¥ó¥Ñ¥¤¥ë¤·¤Æ¤ß¤Æ¤¯¤À¤µ¤¤¡£¤½¤ì¤«
426 ¤é°Ê²¼¤ÎÎã¤Î¤è¤¦¤Ë¥Æ¥¥¹¥È¤òÆþÎϤ·¤Æ¤ß¤Æ¤¯¤À¤µ¤¤¡£
448 You may also be wondering where all this whitespace is coming from in the
449 output. The reason is simple: it was in the input, and we don't match on it
450 anywhere, so it gets output again.
453 ½ÐÎϤΥۥ磻¥È¥¹¥Ú¡¼¥¹¤¬¤É¤³¤«¤éÍ褿¤Î¤«¡¢ÉԻ׵Ĥ˻פï¤ì¤¿¤«¤â¤·¤ì¤Þ¤»
454 ¤ó¡£Íýͳ¤Ï´Êñ¤Ç¤¹¡£¤³¤ì¤é¤Ï¡¢¤â¤È¤â¤ÈÆþÎϤ˴ޤޤì¤Æ¤¤¤¿¤â¤Î¤Ç¤¹¤¬¡¢¥Þ¥Ã
455 ¥Á¤·¤Ê¤¤¤¿¤á¤½¤Î¤Þ¤Þ½ÐÎϤȤʤäƸ½¤ì¤¿¤È¤¤¤¦¤À¤±¤ÎÏäǤ¹¡£
458 The Flex manpage documents its regular expressions in detail. Many people
459 feel that the perl regular expression manpage (perlre) is also very useful,
460 although Flex does not implement everything perl does.
463 Flex ¤Î man ¥Ú¡¼¥¸¤Ë¤Ï¡¢»È¤ï¤ì¤Æ¤¤¤ëÀµµ¬É½¸½¤Ë¤Ä¤¤¤Æ¾Ü¤·¤¯ºÜ¤Ã¤Æ¤¤¤Þ¤¹¡£
464 ¤Þ¤¿¡¢perl ¤ÎÀµµ¬É½¸½¤Î man ¥Ú¡¼¥¸ (perlre) ¤òÊØÍø¤À¤È´¶¤¸¤é¤ì¤ëÊý¡¹¤â
465 ¤¿¤¯¤µ¤ó¤¤¤Þ¤¹ - ¤â¤Ã¤È¤â Flex ¤ÎÀµµ¬É½¸½¤Î¼ÂÁõ¤Ï¡¢perl ¤Û¤É´°Á´¤Ç¤Ï¤¢
469 Make sure that you do not create zero length matches like '[0-9]*' - your
470 lexer might get confused and start matching empty strings repeatedly.
473 &dquot;[0-9]*&dquot; ¤Î¤è¤¦¤Ë¡¢Ä¹¤µ¥¼¥í¤Î¥Þ¥Ã¥Á¤Ï¹Ô¤ï¤Ê¤¤¤è¤¦¤ËÃí°Õ¤·¤Æ¤¯¤À¤µ¤¤¡£
474 »ú¶ç²òÀϴ郎º®Í𤷤Ƥ·¤Þ¤¤¡¢¶õʸ»úÎó¤È¤Î¥Þ¥Ã¥Á¤ò·«¤êÊÖ¤¹¤è¤¦¤Ê¤³¤È¤Ë¤Ê
480 A more complicated example for a C like syntax
483 C ¤Î¤è¤¦¤Ê¥·¥ó¥¿¥Ã¥¯¥¹¤ò¤â¤Ä¤â¤¦¾¯¤·¹âÅÙ¤ÊÎã
488 Let's say we want to parse a file that looks like this:
491 °Ê²¼¤Î¤è¤¦¤ÊÀßÄê¥Õ¥¡¥¤¥ë¤ò¹½Ê¸²òÀϤ·¤¿¤¤¤È¤·¤Þ¤¹¡£
495 category lame-servers { null; };
496 category cname { null; };
499 zone &dquot;.&dquot; {
501 file &dquot;/etc/bind/db.root&dquot;;
506 We clearly see a number of categories (tokens) in this file:
509 ¤³¤Î¥Õ¥¡¥¤¥ëÆâ¤Ë¤Ï¤¤¤¯¤Ä¤â¤Î¥«¥Æ¥´¥ê¡Ê¥È¡¼¥¯¥ó¡Ë¤¬¤¢¤ë¤Î¤¬¤ï¤«¤ê¤Þ¤¹¡£
513 <item> WORDs, like 'zone' and 'type'
514 <item> FILENAMEs, like '/etc/bind/db.root'
515 <item> QUOTEs, like those surrounding the filename
521 <item> &dquot;zone&dquot; ¤ä &dquot;type&dquot; ¤Ê¤É¤Î WORD
522 <item> &dquot;/etc/bind/db.root&dquot; ¤Ê¤É¤Î FILENAME
523 <item> ¥Õ¥¡¥¤¥ë¥Í¡¼¥à¤ò³ç¤Ã¤Æ¤¤¤ë QUOTE
524 <item> { ¤òɽ¤¹ OBRACE
525 <item> } ¤òɽ¤¹ EBRACE
526 <item> ; ¤òɽ¤¹ SEMICOLON
530 The corresponding Lex file is Example 3:
533 Âбþ¤¹¤ë Lex ¥Õ¥¡¥¤¥ë¤Ï Example 3 ¤Î¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
542 [a-zA-Z][a-zA-Z0-9]* printf(&dquot;WORD &dquot;);
543 [a-zA-Z0-9\/.-]+ printf(&dquot;FILENAME &dquot;);
544 \&dquot; printf(&dquot;QUOTE &dquot;);
545 \{ printf(&dquot;OBRACE &dquot;);
546 \} printf(&dquot;EBRACE &dquot;);
547 ; printf(&dquot;SEMICOLON &dquot;);
548 \n printf(&dquot;\n&dquot;);
549 [ \t]+ /* ignore whitespace */;
560 [a-zA-Z][a-zA-Z0-9]* printf(&dquot;WORD &dquot;);
561 [a-zA-Z0-9\/.-]+ printf(&dquot;FILENAME &dquot;);
562 \&dquot; printf(&dquot;QUOTE &dquot;);
563 \{ printf(&dquot;OBRACE &dquot;);
564 \} printf(&dquot;EBRACE &dquot;);
565 ; printf(&dquot;SEMICOLON &dquot;);
566 \n printf(&dquot;\n&dquot;);
567 [ \t]+ /* ¥Û¥ï¥¤¥È¥¹¥Ú¡¼¥¹¤Ï̵»ë */;
572 When we feed our file to the program this Lex file generates (using
573 example3.compile), we get:
576 ¥×¥í¥°¥é¥à¤ËÀßÄê¥Õ¥¡¥¤¥ë¤òÆþÎϤ¹¤ë¤È¡¢¤³¤Î Lex ¥Õ¥¡¥¤¥ë¤«¤é
577 ¡Êexample3.compile ¤ò»È¤Ã¤Æ)°Ê²¼¤Î¤è¤¦¤Ê½ÐÎϤ¬ÆÀ¤é¤ì¤Þ¤¹¡£
581 WORD FILENAME OBRACE WORD SEMICOLON EBRACE SEMICOLON
582 WORD WORD OBRACE WORD SEMICOLON EBRACE SEMICOLON
585 WORD QUOTE FILENAME QUOTE OBRACE
587 WORD QUOTE FILENAME QUOTE SEMICOLON
592 When compared with the configuration file mentioned above, it is clear that
593 we have neatly 'Tokenized' it. Each part of the configuration file has been
594 matched, and converted into a token.
597 ÀßÄê¥Õ¥¡¥¤¥ë¤È¸«Èæ¤Ù¤ë¤È¡¢Å¬ÀÚ¤Ë '¥È¡¼¥¯¥ó²½' ¤µ¤ì¤¿¤Î¤¬¤ï¤«¤ê¤Þ¤¹¡£¥Õ¥¡
598 ¥¤¥ë¤Î³Æ¡¹¤ÎÉôʬ¤Ç¡¢Àµµ¬É½¸½¤Ë¤è¤ë¥Þ¥Ã¥Á¤¬¤È¤é¤ì¡¢¥È¡¼¥¯¥ó¤ËÊÑ´¹¤µ¤ì¤Æ
602 And this is exactly what we need to put YACC to good use.
605 ¤³¤ì¤¬¡¢YACC ¤ò³èÍѤ¹¤ë¤¿¤á¤ËɬÍפʤ³¤È¤Ê¤Î¤Ç¤¹¡£
616 We've seen that Lex is able to read arbitrary input, and determine what each
617 part of the input is. This is called 'Tokenizing'.
620 Lex ¤ÏǤ°Õ¤ÎÆþÎϤ«¤é¡¢¤½¤ì¤¾¤ì¤ÎÉôʬ¤¬²¿¤Ç¤¢¤ë¤«·èÄꤹ¤ë¤³¤È¤¬¤Ç¤¤ë¡¢
621 ¤È¤¤¤¦¤³¤È¤¬¤ï¤«¤ê¤Þ¤·¤¿¡£¤³¤ì¤ò '¥È¡¼¥¯¥ó²½¤¹¤ë' ¤È¤¤¤¤¤Þ¤¹¡£
626 YACC can parse input streams consisting of tokens with certain values. This
627 clearly describes the relation YACC has with Lex, YACC has no idea
628 what 'input streams' are, it needs preprocessed tokens. While you can write your
629 own Tokenizer, we will leave that entirely up to Lex.
632 YACC ¤Ï¡¢¤¢¤ëÃͤò¤â¤Ä¥È¡¼¥¯¥ó¤«¤é¹½À®¤µ¤ì¤ë¡¢ÆþÎÏ¥¹¥È¥ê¡¼¥à¤Î¹½Ê¸²òÀÏ
633 ¤ò¤¹¤ë¤³¤È¤¬¤Ç¤¤Þ¤¹¡£¤³¤Î¤³¤È¤Ï¡¢Lex ¤ËÂФ¹¤ë YACC ¤Î´Ø·¸¤ò¡¢¤Ï¤Ã¤¤ê
634 ¤È¼¨¤·¤Æ¤¤¤Þ¤¹¡£YACC ¤Ï¤½¤â¤½¤â 'ÆþÎÏ¥¹¥È¥ê¡¼¥à' ¤È¤¤¤¦¤â¤Î¤¬²¿¤Ç¤¢¤ë
635 ¤«¤òÍý²ò¤·¤Æ¤ª¤é¤º¡¢¥È¡¼¥¯¥ó²½¤µ¤ì¤¿ÆþÎϤòɬÍפȤ·¤Þ¤¹¡£¤´¼«¿È¤Ç»ú¶ç²ò
636 ÀÏ¥×¥í¥°¥é¥à¤ò½ñ¤«¤ì¤Æ¤âÎɤ¤¤Ç¤¹¤¬¡¢¤³¤³¤Ç¤Ï¤½¤ì¤Ï Lex ¤Ë¾ù¤ë¤³¤È¤Ë¤·
640 A note on grammars and parsers. When YACC saw the light of day, the tool was
641 used to parse input files for compilers: programs. Programs written in a
642 programming language for computers are typically *not* ambiguous - they have
643 just one meaning. As such, YACC does not cope with ambiguity and will
644 complain about shift/reduce or reduce/reduce conflicts. More about
645 ambiguity and YACC &dquot;problems&dquot; can be found in 'Conflicts' chapter.
648 ʸˡ¤È¹½Ê¸²òÀÏ´ï¤Ë¤Ä¤¤¤Æ¡¢Ê䤷¤Æ¤ª¤¤Þ¤¹¡£YACC ¤Ï¡¢Åо줷¤¿¤Æ¤Îº¢¤Ï
649 ¥³¥ó¥Ñ¥¤¥é¤Ø¤ÎÆþÎÏ¥Õ¥¡¥¤¥ë - ¤Ä¤Þ¤ê¥×¥í¥°¥é¥à- ¤Î¹½Ê¸²òÀϤ˻Ȥï¤ì¤Æ¤¤
650 ¤Þ¤·¤¿¡£¥³¥ó¥Ô¥å¡¼¥¿¸þ¤±¤Î¥×¥í¥°¥é¥ß¥ó¥°¸À¸ì¤Ç½ñ¤«¤ì¤¿¥×¥í¥°¥é¥à¤Ï¡¢ÄÌ
651 ¾ïÛ£Ëæ¤Ê¤È¤³¤í¤Ï *¤Ê¤¯*¡¢°ÕÌ£¤â°ì¤Ä¤Ë¸Â¤é¤ì¤Æ¤¤¤Þ¤¹¡£½¾¤Ã¤Æ¡¢YACC ¤ÏÛ£
652 Ë椵¤òµöÍƤǤ¤º¡¢shift/reduce ¤ä reduce/reduce ¥³¥ó¥Õ¥ê¥¯¥È¤Ê¤É¤Î·Ù¹ð
653 ¤ä¥¨¥é¡¼¤ò½Ð¤·¤Þ¤¹¡£Û£Ë椵¤È YACC ÆÃͤΠ&dquot;ÌäÂêÅÀ&dquot; ¤Ë¤Ä¤¤¤Æ
654 ¤Ï¡¢'¥³¥ó¥Õ¥ê¥¯¥È' ¤Î¾Ï¤ò¤´Í÷¤¯¤À¤µ¤¤¡£
658 A simple thermostat controller
664 Let's say we have a thermostat that we want to control using a simple
665 language. A session with the thermostat may look like this:
668 ñ½ã¤Ê¸À¸ì¤ò»È¤Ã¤ÆÀ©¸æ¤Ç¤¤ë²¹ÅÙÄ´Àá´ï¤¬¤¢¤ë¤È¤·¤Þ¤¹¡£¤³¤Î²¹ÅÙÄ´Àá´ï¤ò
669 »È¤Ã¤¿¤ä¤ê¤È¤ê¤Ï°Ê²¼¤Î¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
676 target temperature 22
681 The tokens we need to recognize are: heat, on/off (STATE), target, temperature,
685 ǧ¼±¤·¤Ê¤¯¤Æ¤Ï¤Ê¤é¤Ê¤¤¥È¡¼¥¯¥ó¤Ï¡¢heat, on/off(STATE), target,
686 temperature, NUMBER ¤Ç¤¹¡£
689 The Lex tokenizer (Example 4) is:
692 ¤³¤Î»ú¶ç²òÀÏ´ï¤ò Lex ¤Çºî¤ë¤È (Example 4)
698 #include &dquot;y.tab.h&dquot;
701 [0-9]+ return NUMBER;
704 target return TOKTARGET;
705 temperature return TOKTEMPERATURE;
706 \n /* ignore end of line */;
707 [ \t]+ /* ignore whitespace */;
715 #include &dquot;y.tab.h&dquot;
718 [0-9]+ return NUMBER;
721 target return TOKTARGET;
722 temperature return TOKTEMPERATURE;
723 \n /* ²þ¹Ô¤Ï̵»ë */;
724 [ \t]+ /* ¥Û¥ï¥¤¥È¥¹¥Ú¡¼¥¹¤Ï̵»ë */;
729 We note two important changes. First, we include the file 'y.tab.h', and
730 secondly, we no longer print stuff, we return names of tokens. This change
731 is because we are now feeding it all to YACC, which isn't interested in
732 what we output to the screen. Y.tab.h has definitions for these tokens.
735 Æó¤ÄÂ礤ʰ㤤¤¬¤¢¤ê¤Þ¤¹¡£°ì¤ÄÌÜ¤Ï 'y.tab.h' ¤ò¥¤¥ó¥¯¥ë¡¼¥É¤·¤Æ¤¤¤ë¤³
736 ¤È¤Ç¤¹¡£Æó¤ÄÌܤϡ¢print ½ÐÎϤ¹¤ë¤Î¤ò¤ä¤á¤Æ¥È¡¼¥¯¥ó̾¤òÊÖ¤¹¤è¤¦¤Ë¤·¤Æ¤¤
737 ¤ë¤È¤¤¤¦¤³¤È¤Ç¤¹¡£¤³¤ì¤Ï¡¢Lex ¤Î½ÐÎϤòÁ´¤Æ YACC ¤ËÆþÎϤ·¤è¤¦¤È¤·¤Æ¤¤¤ë
738 ¤«¤é¤Ç¡¢¥¹¥¯¥ê¡¼¥ó¤Ëɽ¼¨¤¹¤ë°ÕÌ£¤¬¤Ê¤¤¤«¤é¤Ç¤¹¡£y.tab.h ¤Ç¤Ï¥È¡¼¥¯¥ó¤Î
742 But where does y.tab.h come from? It is generated by YACC from the Grammar
743 File we are about to create. As our language is very basic, so is the grammar:
746 y.tab.h ¤Ï¤É¤³¤«¤é½Ð¤ÆÍ褿¤Î¤Ç¤·¤ç¤¦¡©¤³¤ì¤Ï¡¢¸å¤Çºî¤ë¤³¤È¤Ë¤Ê¤ëʸˡ¥Õ¥¡
747 ¥¤¥ë¤«¤é YACC ¤¬À¸À®¤·¤¿¤â¤Î¤Ç¤¹¡£¸À¸ì¤ÈƱÍÍ¡¢Ê¸Ë¡¤âÈó¾ï¤Ëñ½ã¤Ë¤Ê¤Ã¤Æ
751 commands: /* empty */
764 printf(&dquot;\tHeat turned on or off\n&dquot;);
769 TOKTARGET TOKTEMPERATURE NUMBER
771 printf(&dquot;\tTemperature set\n&dquot;);
777 The first part is what I call the 'root'. It tells us that we
778 have 'commands', and that these commands consist of individual 'command'
779 parts. As you can see this rule is very recursive, because it again
780 contains the word 'commands'. What this means is that the program is now
781 capable of reducing a series of commands one by one. Read the chapter 'How
782 do Lex and YACC work internally' for important details on recursion.
785 ÀèƬ¤ÎÉôʬ¤ò¡¢¤³¤³¤Ç¤Ï 'root' ¤È¸Æ¤Ö¤³¤È¤Ë¤·¤Þ¤¹¡£¤³¤ì¤Ï 'commands' ¤È
786 ¤¤¤¦¤â¤Î¤¬ÄêµÁ¤µ¤ì¤Æ¤¤¤Æ¡¢¤½¤ì¤¬¸ÄÊ̤Π'command' ¤«¤é¹½À®¤µ¤ì¤Æ¤¤¤ë¤È
787 ¤¤¤¦¤³¤È¤ò¼¨¤·¤Æ¤¤¤Þ¤¹¡£¥³¥Þ¥ó¥É¤¬¤µ¤é¤Ë¥³¥Þ¥ó¥É¤ò´Þ¤ó¤Ç¤¤¤ë¤³¤È¤«¤é¡¢
788 ¤³¤Îµ¬Â§¤ÏºÆµ¢Åª¤Ç¤¢¤ë¤È¸À¤¨¤Þ¤¹¡£¤³¤ì¤Ï¤Þ¤¿¡¢¹½Ê¸²òÀϴ郎Ϣ³¤¹¤ë¥³¥Þ
789 ¥ó¥É¤ò°ì¤Ä¤º¤Ä´Ô¸µ¤Ç¤¤ë¤è¤¦¤Ë¤Ê¤Ã¤¿¡¢¤È¤¤¤¦¤³¤È¤â°ÕÌ£¤·¤Æ¤¤¤Þ¤¹¡£ºÆµ¢
790 ¤Ë¤Ä¤¤¤Æ¤Ï¡¢'Lex ¤È YACC ¤ÎÆâÉôÆ°ºî' ¤Î¾Ï¤Ë½ÅÍפʵ½Ò¤¬¤¢¤ê¤Þ¤¹¡£
793 The second rule defines what a command is. We support only two kinds of
794 commands, the 'heat_switch' and the 'target_set'. This is what the |-symbol
795 signifies - 'a command consists of either a heat_switch or a target_set'.
798 ¤½¤Î¼¡¤Ï¡¢¥³¥Þ¥ó¥É¤òÄêµÁ¤¹¤ëµ¬Â§¤Ç¤¹¡£¤³¤³¤Ç¤Ï¡¢'heat_switch' ¤È 'target_set'
799 ¤È¤¤¤¦Æó¼ïÎà¤Î¤ß¥µ¥Ý¡¼¥È¤·¤Þ¤¹¡£¤³¤ì¤Ï | µ¹æ¤Çɽ¤µ¤ì¡¢'
800 ¥³¥Þ¥ó¥É¤¬ heat_switch ¤Þ¤¿¤Ï target_set ¤«¤éÀ®¤ë' ¤³¤È¤ò¼¨¤·¤Æ¤¤¤Þ¤¹¡£
803 A heat_switch consists of the HEAT token, which is simply the word 'heat',
804 followed by a state (which we defined in the Lex file as 'on' or 'off').
807 heat_switch ¤Ï¡¢Ã±¤Ë 'heat' ¤È¤¤¤¦Ã±¸ì¤ò»Ø¤¹ HEAT ¥È¡¼¥¯¥ó¤Ë¡¢¾õÂÖ
808 (Lex ¥Õ¥¡¥¤¥ë¤Ç 'on' ¤ä 'off' ¤È¤·¤ÆÄêµÁºÑ¤ß¡Ë¤òÉղä·¤¿¤â¤Î¤Ç¤¹¡£
811 Somewhat more complicated is the target_set, which consists of the TARGET
812 token (the word 'target'), the TEMPERATURE token (the word 'temperature')
816 target_set ¤Ï¤â¤¦¾¯¤·Ê£»¨¤Ç¡¢¤³¤ì¤Ï TARGET ¥È¡¼¥¯¥ó ('target' ¤È¤¤¤¦Ã±
817 ¸ì)¡¢TEMPERATURE ¥È¡¼¥¯¥ó ('temperature' ¤È¤¤¤¦Ã±¸ì) ¤½¤·¤Æ¿ô»ú¤«¤é¹½
829 The previous section only showed the grammar part of the YACC file, but
830 there is more. This is the header that we omitted:
833 Á°¤Î¥»¥¯¥·¥ç¥ó¤Ç¤Ï¡¢YACC ¤ÎʸˡÉôʬ¤À¤±¤Ç¤·¤¿¤¬¡¢¤â¤¦¾¯¤·²òÀ⤷¤Æ¤ª¤¯
834 ¤³¤È¤¬¤¢¤ê¤Þ¤¹¡£°Ê²¼¤Ï¾Êά¤·¤¿¥Ø¥Ã¥À¤ÎÉôʬ¤Ç¤¹¡£
841 void yyerror(const char *str)
843 fprintf(stderr,&dquot;error: %s\n&dquot;,str);
858 %token NUMBER TOKHEAT STATE TOKTARGET TOKTEMPERATURE
862 The yyerror() function is called by YACC if it finds an error. We simply
863 output the message passed, but there are smarter things to do. See
864 the 'Further reading' section at the end.
867 yyerror() ´Ø¿ô¤Ï¥¨¥é¡¼¤¬¸«¤Ä¤«¤Ã¤¿»þ¤Ë¡¢YACC ¤«¤é¸Æ¤Ð¤ì¤Þ¤¹¡£¤³¤³¤Ç¤Ï
868 ñ¤ËÍ¿¤¨¤é¤ì¤¿¥á¥Ã¥»¡¼¥¸¤ò½ÐÎϤ·¤Þ¤¹¤¬¡¢¤â¤¦¾¯¤·¸¤¤¤³¤È¤â¤Ç¤¤Þ¤¹¡£´¬
869 Ëö¤Î'´ØÏ¢½ñÀÒ'¤Î¾Ï¤ò¤´Í÷¤¯¤À¤µ¤¤¡£
872 The function yywrap() can be used to continue reading from another file. It
873 is called at EOF and you can than open another file, and return 0. Or you
874 can return 1, indicating that this is truly the end. For more about this,
875 see the 'How do Lex and YACC work internally' chapter.
878 yywrap() ´Ø¿ô¤Ï¡¢Ï¢Â³¤·¤Æ¾¤Î¥Õ¥¡¥¤¥ë¤«¤éÆɤß³¤±¤ë¤Î¤Ë»È¤ï¤ì¤Þ¤¹¡£
879 EOF ¤Ç¸Æ¤Ð¤ì¡¢¤â¤¦°ì¤Ä¤Î¥Õ¥¡¥¤¥ë¤ò¥ª¡¼¥×¥ó¤·¤¿¸å 0 ¤òÊÖ¤·¤Þ¤¹¡£¤Þ¤¿¤Ï
880 1 ¤òÊÖ¤·¤Æ¡¢¤â¤¦Æɤà¤Ù¤¥Õ¥¡¥¤¥ë¤Ï¤Ê¤¤¤È¤¤¤¦¤³¤È¤òÄÌÃΤ·¤Þ¤¹¡£¾Ü¤·¤¯¤Ï'
881 Lex ¤È YACC ¤ÎÆâÉôÆ°ºî'¤Î¾Ï¤ò¤´Í÷¤¯¤À¤µ¤¤¡£
884 Then there is the main() function, that does nothing but set everything in
888 ¤½¤ì¤«¤é main() ´Ø¿ô¤¬¤¢¤ê¤Þ¤¹¤¬¡¢¤³¤ì¤Ï¥×¥í¥°¥é¥à¤òµ¯Æ°¤¹¤ë¤È¤¤¤¦°Ê³°¤Î¤³
889 ¤È¤Ï²¿¤â¤·¤Æ¤¤¤Þ¤»¤ó¡£
892 The last line simply defines the tokens we will be using. These are output
893 using y.tab.h if YACC is invoked with the '-d' option.
896 ºÇ½ª¹Ô¤Ï¡¢Ã±¤Ë»ÈÍѤ¹¤ë¥È¡¼¥¯¥ó¤òÄêµÁ¤·¤Æ¤¤¤ë¤À¤±¤Ç¤¹¡£¤³¤ì¤é¤Ï YACC ¤ò
897 -d ¥ª¥×¥·¥ç¥ó¤Ç¼Â¹Ô¤·¤¿»þ¤Ë¼«Æ°À¸À®¤µ¤ì¤ë y.tab.h ¤«¤éÆÀ¤é¤ì¤Þ¤¹¡£
902 Compiling & running the thermostat controller
905 ²¹ÅÙÄ´Àá´ï¤Î¥³¥ó¥Ñ¥¤¥ë¤Èµ¯Æ°
911 cc lex.yy.c y.tab.c -o example4
916 A few things have changed. We now also invoke YACC to compile our grammar,
917 which creates y.tab.c and y.tab.h. We then call Lex as usual. When
918 compiling, we remove the -ll flag: we now have our own main() function and
919 don't need the one provided by libl.
922 ¤¤¤¯¤Ä¤«°ÊÁ°¤È°ã¤¦ÅÀ¤¬¤¢¤ê¤Þ¤¹¡£YACC ¤ò»È¤Ã¤Æʸˡ¥Õ¥¡¥¤¥ë¤ò¥³¥ó¥Ñ¥¤¥ë
923 ¤¹¤ë¤³¤È¤Ç¡¢y.tab.c ¤È y.tab.h ¤òÀ¸À®¤·¤Æ¤¤¤Þ¤¹¡£¤½¤ì¤«¤éÉáÄÌ¤Ë Lex ¤ò
924 ¸Æ¤Ó½Ð¤·¤Æ¤¤¤Þ¤¹¡£¥³¥ó¥Ñ¥¤¥ë¤¹¤ë»þ¤Ï -ll ¥Õ¥é¥°¤ò³°¤·¤Æ¤¯¤À¤µ¤¤¡£¤³
925 ¤³¤Ç¤Ïmain() ´Ø¿ô¤òÄêµÁ¤·¤Æ¤¤¤ë¤Î¤Ç¡¢libl ¤ÇÄ󶡤µ¤ì¤ë¤â¤Î¤ò»È¤¦É¬Íפ¬
930 NOTE: if you get an error about your compiler not being able to
931 find 'yylval', add this to example4.l, just beneath #include
934 extern YYSTYPE yylval;
936 This is explained in the 'How Lex and YACC work internally' section.
941 Ãí°Õ - ¥³¥ó¥Ñ¥¤¥é¤¬ 'yylval' ¤¬¸«¤Ä¤«¤é¤Ê¤¤¤È¤¤¤¦¥¨¥é¡¼¤ò½Ð¤¹¾ì¹ç¤Ï¡¢
942 example4.l ¤Î #include <y.tab.h> ¤Îľ¸å¤Ë¡¢°Ê²¼¤òµ½Ò¤·¤Æ¤¯¤À¤µ
945 extern YYSTYPE yylval;
947 ¤³¤ì¤Ë¤Ä¤¤¤Æ¤Ï 'Lex ¤È YACC ¤ÎÆâÉôÆ°ºî' ¤Î¾Ï¤ËÀâÌÀ¤µ¤ì¤Æ¤¤¤Þ¤¹¡£
954 °Ê²¼¤Ï¡¢´Êñ¤ÊÆ°ºîÎã¤Ç¤¹¡£
959 Heat turned on or off
961 Heat turned on or off
962 target temperature 10
970 This is not quite what we set out to achieve, but in the interest of keeping
971 the learning curve manageable, not all cool stuff can be presented at once.
974 ËÜÅö¤Ë¤ä¤ê¤¿¤«¤Ã¤¿¤³¤È¤È¤Ï¿¾¯¤º¤ì¤Æ¤¤¤Þ¤¹¤¬¡¢ÌµÍý¤Î¤Ê¤¤³Ø½¬¶ÊÀþ¤òé¤ë
975 ¤È¤¤¤¦°ÕÌ£¤Ç¤â¡¢¤³¤³¤Ç¤«¤Ã¤³¤¤¤¤¥³¡¼¥É¤ä¥Æ¥¯¥Ë¥Ã¥¯¤ò¤¤¤Ã¤Ú¤ó¤Ë¾Ò²ð¤¹¤ë
981 Expanding the thermostat to handle parameters
984 °ú¿ô¤ò°·¤¨¤ë¤è¤¦¤Ë³ÈÄ¥¤·¤¿¡¢²¹ÅÙÄ´Àá´ï
989 As we've seen, we now parse the thermostat commands correctly, and even flag
990 mistakes properly. But as you might have guessed by the weasely wording, the
991 program has no idea of what it should do, it does not get passed any of the
995 ¤³¤³¤Þ¤Ç¤Ç¡¢²¹ÅÙÄ´Àá´ï¤Î¥³¥Þ¥ó¥É¤òÀµ¤·¤¯¹½Ê¸²òÀϤ¹¤ë¤³¤È¤¬¤Ç¤¤ë¤è¤¦¤Ë
996 ¤Ê¤Ã¤¿¤À¤±¤Ç¤Ê¤¯¡¢¥¨¥é¡¼¤ÎÄÌÃνèÍý¤âŬÀڤ˹Ԥ¨¤ë¤è¤¦¤Ë¤Ê¤ê¤Þ¤·¤¿¡£¤·¤«
997 ¤·¡¢¡ÊTemperature set ¤È¤¤¤¦¤è¤¦¤Ê¡ËÛ£Ëæ¤Ê¸À¤¤²ó¤·¤«¤é¤âÁÛÁü¤¬¤Ä¤¯¤è¤¦
998 ¤Ë¡¢¥×¥í¥°¥é¥à¤Ï²¿¤ò¤¹¤Ù¤¤«Íý²ò¤·¤Æ¤ª¤é¤º¡¢¥æ¡¼¥¶¤«¤éÆþÎϤµ¤ì¤¿Ãͤâ¼õ
1002 Let's start by adding the ability to read the new target temperature. In
1003 order to do so, we need to learn the NUMBER match in the Lexer to convert
1004 itself into an integer value, which can then be read in YACC.
1007 ¿·µ¬¤ÎÀßÄê²¹ÅÙÃͤòÆɤ߹þ¤àµ¡Ç½¤òÄɲ䷤Ƥߤޤ·¤ç¤¦¡£¤³¤ì¤ò¤¹¤ë¤¿¤á¤Ë¤Ï¡¢
1008 »ú¶ç²òÀÏ´ï¤Ç¤É¤Î¤è¤¦¤Ë NUMBER ¤ËÂФ¹¤ë¥Þ¥Ã¥Á¤¬¤Ê¤µ¤ì¤Æ¡¢YACC ¤ÇÆɤá¤ë
1009 ¤è¤¦¤ÊÀ°¿ôÃͤËÊÑ´¹¤µ¤ì¤ë¤Î¤«¤òÃΤëɬÍפ¬¤¢¤ê¤Þ¤¹¡£
1012 Whenever Lex matches a target, it puts the text of the match in the
1013 character string 'yytext'. YACC in turn expects to find a value in the
1014 variable 'yylval'. In Example 5, we see the obvious solution:
1017 Lex ¤Ç¤Ï¡¢¥¿¡¼¥²¥Ã¥È¤Ë¥Þ¥Ã¥Á¤¹¤ë¤â¤Î¤¬¤¢¤Ã¤¿»þ¡¢'yytext' ¤È¤¤¤¦Ê¸»úÎó
1018 ¤Ë¥Þ¥Ã¥Á¤·¤¿¥Æ¥¥¹¥È¤ò³ÊǼ¤·¤Þ¤¹¡£°ìÊýYACC ¤Ç¤Ï¡¢¿ôÃͤΥޥåÁ¤Ï'
1019 yylval' ÊÑ¿ô¤ÎÃͤòÆɤळ¤È¤ÇÆÀ¤é¤ì¤Þ¤¹¡£Example 5 ¤Ï¤½¤Î¼ÂÁõ¤Ç¤¹¡£
1025 #include &dquot;y.tab.h&dquot;
1028 [0-9]+ yylval=atoi(yytext); return NUMBER;
1029 heat return TOKHEAT;
1030 on|off yylval=!strcmp(yytext,&dquot;on&dquot;); return STATE;
1031 target return TOKTARGET;
1032 temperature return TOKTEMPERATURE;
1033 \n /* ignore end of line */;
1034 [ \t]+ /* ignore whitespace */;
1043 #include &dquot;y.tab.h&dquot;
1046 [0-9]+ yylval=atoi(yytext); return NUMBER;
1047 heat return TOKHEAT;
1048 on|off yylval=!strcmp(yytext,&dquot;on&dquot;); return STATE;
1049 target return TOKTARGET;
1050 temperature return TOKTEMPERATURE;
1051 \n /* ²þ¹Ô¤Ï̵»ë */;
1052 [ \t]+ /* ¥Û¥ï¥¤¥È¥¹¥Ú¡¼¥¹¤Ï̵»ë */;
1058 As you can see, we run atoi() on yytext, and put the result in yylval, where
1059 YACC can see it. We do much the same for the STATE match, where we compare
1060 it to 'on', and set yylval to 1 if it is equal. Please note that having a
1061 separate 'on' and 'off' match in Lex would produce faster code, but I wanted
1062 to show a more complicated rule and action for a change.
1065 ¤´Í÷¤ÎÄ̤ꡢyytext ¤ò°ú¿ô¤È¤·¤Æ atoi() ¤ò¼Â¹Ô¤·¡¢·ë²Ì¤ò YACC ¤¬Íý²ò¤Ç
1066 ¤¤ë yylval ¤Ë³ÊǼ¤·¤Æ¤¤¤Þ¤¹¡£STATE ¤Ë¤Ä¤¤¤Æ¤â¤Û¤È¤ó¤ÉƱÍͤνèÍý¤ò¹Ô¤Ã
1067 ¤Æ¤ª¤ê¡¢'on' ¤Ë¥Þ¥Ã¥Á¤¹¤ëʸ»úÎ󤬤¢¤ì¤Ð yylval ¤Ë 1 ¤ò³ÊǼ¤·¤Æ¤¤¤Þ¤¹¡£
1068 Lex ¤Ç¤Ï 'on' ¤È 'off' ¤Î¤è¤¦¤Ê¥Þ¥Ã¥Á¤ÏÊÌ¡¹¤Ë¤¹¤ë¤È¡¢¹â®¤Ê¥³¡¼¥É¤¬À¸
1069 À®¤µ¤ì¤ë¤È¤¤¤¦¤³¤È¤Ï³Ð¤¨¤Æ¤ª¤¤¤Æ¤¯¤À¤µ¤¤¡£¤³¤³¤Ç¤Ï¤Á¤ç¤Ã¤ÈÊ£»¨¤Êµ¬Â§¤È
1070 Æ°ºî¤ò¤ª¸«¤»¤·¤¿¤¯¤Æ¡¢°ì½ï¤Ë¤·¤Æ¤¤¤Þ¤¹¡£
1073 Now we need to learn YACC how to deal with this. What is called 'yylval' in
1074 Lex has a different name in YACC. Let's examine the rule setting the new
1078 ¤µ¤Æ¡¢¤³¤ì¤Ë¤Ï YACC ¤Ç¤Ï¤É¤¦Âбþ¤¹¤ì¤ÐÎɤ¤¤Î¤Ç¤·¤ç¤¦¤«¡£Lex ¤Î 'yylval'
1079 ¤Ï YACC ¤Ç¤ÏÊ̤Î̾Á°¤Ç»²¾È¤µ¤ì¤Þ¤¹¡£¿·µ¬¤ÎÀßÄê²¹ÅÙÃͤòµ½Ò¤¹
1080 ¤ëµ¬Â§¤ò¸«¤Æ¤ß¤Þ¤·¤ç¤¦¡£
1084 TOKTARGET TOKTEMPERATURE NUMBER
1086 printf(&dquot;\tTemperature set to %d\n&dquot;,$3);
1092 To access the value of the third part of the rule (ie, NUMBER), we need to
1093 use $3. Whenever yylex() returns, the contents of yylval are attached to the
1094 terminal, the value of which can be accessed with the $-construct.
1097 ¥³¥Þ¥ó¥ÉÄêµÁ¤Î»°ÈÖÌÜ (¨¤Á¡¢NUMBER) ¤ÎÃͤ˥¢¥¯¥»¥¹¤¹¤ë¤Ë¤Ï¡¢$3 ¤ò»È¤¤
1098 ¤Þ¤¹¡£yylval ¤ÎÃͤϡ¢yylex() ¤«¤éÌá¤Ã¤Æ¤¯¤ëÅ٤˥Хåե¡¤ÎºÇ¸åÈø¤ËÄɲÃ
1099 ¤µ¤ì¤Æ¹Ô¤¡¢$ ¥³¥ó¥¹¥È¥é¥¯¥È¤Ç¥¢¥¯¥»¥¹¤Ç¤¤ë¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
1102 To expound on this further, let's observe the new 'heat_switch' rule:
1105 ¤â¤¦¾¯¤·¾Ü¤·¤¯ÀâÌÀ¤¹¤ë¤¿¤á¤Ë¡¢¿·¤·¤¤ 'heat_switch' ¤Îµ¬Â§¤ò¸«¤Æ¤ß¤Þ
1113 printf(&dquot;\tHeat turned on\n&dquot;);
1115 printf(&dquot;\tHeat turned off\n&dquot;);
1121 If you now run example5, it properly outputs what you entered.
1124 example5 ¤ò»î¤·¤Æ¤ß¤Æ¤¯¤À¤µ¤¤¡£ÆþÎϤ¬Å¬ÀڤʷÁ¤Ç½ÐÎϤµ¤ì¤ë¤Ï¤º¤Ç¤¹¡£
1128 Parsing a configuration file
1131 ÀßÄê¥Õ¥¡¥¤¥ë¤Î¹½Ê¸²òÀÏ
1135 Let's repeat part of the configuration file we mentioned earlier:
1138 °ÊÁ°¤Ë¿¨¤ì¤¿ÀßÄê¥Õ¥¡¥¤¥ë¤Î°ìÉô¤ò¡¢¤â¤¦°ìÅÙ¸«¤Æ¤ß¤Þ¤·¤ç¤¦¡£
1141 zone &dquot;.&dquot; {
1143 file &dquot;/etc/bind/db.root&dquot;;
1148 Remember that we already wrote a Lexer for this file. Now all we need to do
1149 is write the YACC grammar, and modify the Lexer so it returns values in
1150 a format YACC can understand.
1153 ¤³¤Î¥Õ¥¡¥¤¥ëÍѤλú¶ç²òÀÏ´ï¤Ï´û¤Ëºî¤ê¤Þ¤·¤¿¡£¤¢¤È¤Ï YACC ¤Îʸˡ¥Õ¥¡¥¤¥ë
1154 ¤òºî¤ê¡¢»ú¶ç²òÀÏ´ï¤ÎÌá¤êÃͤò YACC ¤¬Íý²ò¤Ç¤¤ë¤è¤¦¤Ê·Á¼°¤Ë½¤Àµ¤¹¤ë¤À¤±
1158 In the lexer from Example 6 we see:
1161 Example 6 ¤Î »ú¶ç²òÀϴ狼¤é°Ê²¼¤Î¤³¤È¤¬¤ï¤«¤ê¤Þ¤¹¡£
1167 #include &dquot;y.tab.h&dquot;
1172 zone return ZONETOK;
1173 file return FILETOK;
1174 [a-zA-Z][a-zA-Z0-9]* yylval=strdup(yytext); return WORD;
1175 [a-zA-Z0-9\/.-]+ yylval=strdup(yytext); return FILENAME;
1176 \&dquot; return QUOTE;
1177 \{ return OBRACE;
1178 \} return EBRACE;
1180 \n /* ignore EOL */;
1181 [ \t]+ /* ignore whitespace */;
1190 #include &dquot;y.tab.h&dquot;
1195 zone return ZONETOK;
1196 file return FILETOK;
1197 [a-zA-Z][a-zA-Z0-9]* yylval=strdup(yytext); return WORD;
1198 [a-zA-Z0-9\/.-]+ yylval=strdup(yytext); return FILENAME;
1199 \&dquot; return QUOTE;
1200 \{ return OBRACE;
1201 \} return EBRACE;
1203 \n /* EOL¤ò̵»ë */;
1204 [ \t]+ /* ¥Û¥ï¥¤¥È¥¹¥Ú¡¼¥¹¤ò̵»ë */;
1210 If you look carefully, you can see that yylval has changed! We no longer
1211 expect it to be an integer, but in fact assume that it is a char *. In the
1212 interest of keeping things simple, we invoke strdup and waste a lot of
1213 memory. Please note that this may not be a problem in many areas where you
1214 only need to parse a file once, and then exit.
1217 Ãí°Õ¿¼¤¯¸«¤Æ¤ß¤ë¤È¡¢yylval ¤¬°ã¤¦¤³¤È¤Ëµ¤¤Å¤¤¤¿¤Ç¤·¤ç¤¦! À°¿ôÃͤǤ¢¤ë
1218 ¤³¤È¤¹¤é´üÂÔ¤·¤Æ¤¤¤Þ¤»¤ó¤·¡¢¼ÂºÝ char * ¤Ç¤¢¤ë¤È²¾Äꤷ¤Æ¤¤¤Þ¤¹¡£ÌäÂê¤ò
1219 ´Êñ¤Ë¤¹¤ë¤¿¤á¤Ë¡¢¥á¥â¥ê¤òϲÈñ¤¹¤ë¤Î¤â¹½¤ï¤º strdup ¤ò¼Â¹Ô¤·¤Æ¤ß¤Þ¤¹¡£
1220 ¤Ò¤È¤Ä¤Î¥Õ¥¡¥¤¥ë¤ò°ìÅÙ¤À¤±¥Ñ¡¼¥¹¤·¤Æ½ªÎ»¡¢¤È¤¤¤¦¤è¤¦¤Ê°ìÈÌŪ¤ÊÍÑÅӤˤª
1221 ¤¤¤Æ¤Ï¡¢¤³¤ì¤ÇÌäÂê¤Ê¤¤¤È¤¤¤¦¤³¤È¤ò³Ð¤¨¤Æ¤ª¤¤¤Æ¤¯¤À¤µ¤¤¡£
1224 We want to store character strings because we are now mostly dealing with
1225 names: file names and zone names. In a later chapter we will explain how to
1226 deal with multiple types of data.
1229 ¤³¤³¤Ç¤Ï¥Õ¥¡¥¤¥ë̾¤ä¥¾¡¼¥ó̾¤Î¤è¤¦¤Ê̾Á°¤òºÇ¤âÉÑÈˤ˰·¤¦¤Î¤Ç¡¢¤½¤ì¤é¤ò
1230 ʸ»úÎó¤È¤·¤Æ³ÊǼ¤·¤¿¤¤¤È¤·¤Þ¤¹¡£¥Ç¡¼¥¿¤ÎÊ£¿ô¤Î·¿¤Î°·¤¤Êý¤Ë¤Ä¤¤¤Æ¤Ï¸å½Ò
1234 In order to tell YACC about the new type of yylval, we add this line to the
1235 header of our YACC grammar:
1238 YACC ¤Ë¿·¤·¤¤·¿¤Î yylval ¤ò¶µ¤¨¤Æ¤ä¤ë¤Ë¤Ï¡¢YACC ¤Îʸˡ¥Õ¥¡¥¤¥ë¤ÎÀèƬ¤Ë
1241 <verb>#define YYSTYPE char *</verb>
1244 The grammar itself is again more complicated. We chop it in parts to make it
1248 ʸˡ¼«ÂΤϹ¹¤ËÊ£»¨¤Ê¤â¤Î¤Ë¤Ê¤Ã¤Æ¤¤¤Þ¤¹¡£Íý²ò¤·¤ä¤¹¤¤¤è¤¦¤Ëʬ³ä¤·¤Æ¤ß¤Þ
1254 commands command SEMICOLON
1263 ZONETOK quotedname zonecontent
1265 printf(&dquot;Complete zone for '%s' found\n&dquot;,$2);
1272 This is the intro, including the aforementioned recursive 'root'. Please
1273 note that we specify that commands are terminated (and separated) by ;'s. We
1274 define one kind of command, the 'zone_set'. It consists of the ZONE token
1275 (the word 'zone'), followed by a quoted name and the 'zonecontent'. This
1276 zonecontent starts out simple enough:
1279 ¤³¤ì¤Ï¡¢¾å½Ò¤ÎºÆµ¢Åª¤Ê 'root' ¤ò´Þ¤àƳÆþÉôʬ¤Ç¤¹¡£¥³¥Þ¥ó¥É¤¬ ; ¤Ç½ªÃ¼
1280 ¤µ¤ì¤Æ¡Ê¤½¤·¤Æ¶èÀÚ¤é¤ì¤Æ¡Ë¤¤¤ë¤³¤È¤Ëα°Õ¤·¤Æ¤¯¤À¤µ¤¤¡£¤³¤³¤Ç¤Ï 'zone_set'
1281 ¤È¤¤¤¦¡¢¥³¥Þ¥ó¥É¤Î¤ßÄêµÁ¤·¤Þ¤¹¡£¤³¤Î¥³¥Þ¥ó¥É¤Ï ZONE ¥È¡¼¥¯
1282 ¥ó¡Ê 'zone' ¤È¤¤¤¦Ã±¸ì¡Ë¤È¡¢¤½¤ì¤Ë³¤¯°úÍÑÉä¤Ç³ç¤é¤ì¤¿Ì¾Á°¡¢¤½¤ì¤Ë 'zonecontent'
1283 ¤«¤éÀ®¤ê¤Þ¤¹¡£¤Þ¤º¤Ï¤È¤Ã¤«¤«¤ê°×¤¤ zonecontent ¤Ç¤¹¤¬ -
1287 OBRACE zonestatements EBRACE
1292 It needs to start with an OBRACE, a {. Then follow the zonestatements,
1293 followed by an EBRACE, }.
1296 ¤³¤ì¤Ï { ¤Çɽ¤µ¤ì¤ë OBRACE ¤Ç»Ï¤Þ¤ê¤Þ¤¹¡£¤½¤ì¤«¤é zonestatements¡¢¤½¤·
1297 ¤Æ } ¤Çɽ¤µ¤ì¤ë EBRACE ¤È³¤¤Þ¤¹¡£
1301 QUOTE FILENAME QUOTE
1308 This section defines what a 'quotedname' is: a FILENAME between QUOTEs.
1309 Then it says something special: the value of a quotedname token is the value
1310 of the FILENAME. This means that the quotedname has as its value the
1311 filename without quotes.
1314 ¤³¤Î¥»¥¯¥·¥ç¥ó¤Ï 'quotedname' ¤òÄêµÁ¤·¤Æ¤¤¤Þ¤¹¡£QUOTE ¤Ë¶´¤Þ¤ì¤¿
1315 FILENAME ¤È¤¤¤¦°ÕÌ£¤Ç¤¹¤¬¡¢¤Á¤ç¤Ã¤ÈÆüì¤Ê¤Î¤Ï¡¢quotedname ¤È¤¤¤¦¥È¡¼¥¯
1316 ¥ó¤ÎÃͤ¬ FILENAME ¤ÎÃͤËÅù¤·¤¤¤È¤¤¤¦¤³¤È¤Ç¤¹¡£¤Ä¤Þ¤ê¡¢quotedname ¤Ï¥Õ¥¡
1317 ¥¤¥ë̾¤«¤é°úÍÑÉä¤ò½ü¤¤¤¿¤â¤Î¤Ç¤¢¤ë¤È¤¤¤¦°ÕÌ£¤Ç¤¹¡£
1320 This is what the magic '$$=$2;' command does. It says: my value is the value
1321 of my second part. When the quotedname is now referenced in other rules, and
1322 you access its value with the $-construct, you see the value that we set
1326 ¤³¤ì¤ÏËâË¡¤Î '$$=$2' ¥³¥Þ¥ó¥É¤¬¤ä¤Ã¤Æ¤¯¤ì¤ë¤³¤È¤Ç¡¢¼«¿È¤ÎÃͤϼ«¿È¤ÎÆó
1327 ÈÖÌܤÎÉô°Ì¤ÎÃͤǤ¢¤ë¤È¤¤¤¦¤³¤È¤ò»Ø¤·¤Þ¤¹¡£Â¾¤Îʸˡµ¬Â§¤Ç¤â»²¾È¤µ¤ì¤Æ
1328 ¤¤¤ë quotedname ¤Ë $ ¥³¥ó¥¹¥È¥é¥¯¥È¤Ç¥¢¥¯¥»¥¹¤¹¤ë¤È¡¢¤³¤³¤Ç $$=$2 ¤È¤·
1329 ¤ÆÀßÄꤷ¤¿Ãͤ¬ÆÀ¤é¤ì¤Þ¤¹¡£
1333 NOTE: this grammar chokes on filenames without either a '.' or a '/'
1339 Ãí°Õ - ¤³¤Îʸˡ¤Ç¤Ï¡¢¥¾¡¼¥ó¥Õ¥¡¥¤¥ë̾¤Ë '.' ¤« '/' ¤«¤¬´Þ¤Þ¤ì¤Æ¤¤¤Ê¤¤¤È
1346 zonestatements zonestatement SEMICOLON
1354 printf(&dquot;A zonefile name '%s' was encountered\n&dquot;, $2);
1361 This is a generic statement that catches all kinds of statements within
1362 the 'zone' block. We again see the recursiveness.
1365 ¤³¤ì¤Ï¡¢'zone' ¥Ö¥í¥Ã¥¯Æâ¤Î¤¢¤é¤æ¤ë¼ïÎà¤Îʸ¤ËÂбþ¤Ç¤¤ë¤è¤¦¤Ë°ìÈ̲½¤·
1366 ¤¿Ê¸¤Ç¤¹¡£¤³¤³¤Ç¤âºÆµ¢À¤¬Ç§¤á¤é¤ì¤Þ¤¹¡£
1370 OBRACE zonestatements EBRACE SEMICOLON
1374 | statements statement
1377 statement: WORD | block | quotedname
1381 This defines a block, and 'statements' which may be found within.
1384 ¤³¤ì¤Ï¥Ö¥í¥Ã¥¯¤È¡¢'ʸ' ¤ÎÃæ¤Ë½Ð¸½¤¹¤ë 'ʸ' ¤òÄêµÁ¤·¤Æ¤¤¤Þ¤¹¡£
1387 When executed, the output is like this:
1390 ¼Â¹Ô¤µ¤ì¤ë¤È¡¢½ÐÎϤϰʲ¼¤Î¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
1394 zone &dquot;.&dquot; {
1396 file &dquot;/etc/bind/db.root&dquot;;
1399 A zonefile name '/etc/bind/db.root' was encountered
1400 Complete zone for '.' found
1405 Making a Parser in C++
1408 C++ ¤Ç¤Î¹½Ê¸²òÀÏ´ï¤ÎºîÀ®
1413 Although Lex and YACC predate C++, it is possible to generate a C++ parser.
1414 While Flex includes an option to generate a C++ lexer, we won't be using
1415 that, as YACC doesn't know how to deal with it directly.
1418 Lex ¤È YACC ¤Ï C++ ¤¬Åо줹¤ë°ÊÁ°¤«¤é¤¢¤ê¤Þ¤¹¤¬¡¢C++ ¤Ç¹½Ê¸²òÀÏ´ï¤òºî
1419 À®¤¹¤ë¤³¤È¤â²Äǽ¤Ç¤¹¡£Flex ¤Ë¤Ï C++ ¤Î »ú¶ç²òÀÏ´ï¤òÀ¸À®¤¹¤ë¥ª¥×¥·¥ç¥ó
1420 ¤â¤¢¤ê¤Þ¤¹¤¬¡¢YACC ¤ÎÊý¤Ë¤½¤ì¤òľÀÜ°·¤¦ÊýË¡¤¬¤Ê¤¤¤Î¤Ç¡¢¤³¤³¤Ç¤Ï»ÈÍѤ·
1424 My preferred way to make a C++ parser is to have Lex generate a plain C
1425 file, and to let YACC generate C++ code. When you then link your
1426 application, you may run into some problems because the C++ code by default
1427 won't be able to find C functions, unless you've told it that those
1428 functions are extern &dquot;C&dquot;.
1431 É®¼Ô¤¬ C++ ¤Î¹½Ê¸²òÀÏ´ï¤òºî¤ëºÝ¤Ë¹¥¤ó¤Ç»È¤¦ÊýË¡¤Ï¡¢Lex ¤ËÉáÄ̤ΠC ¥Õ¥¡
1432 ¥¤¥ë¤Î»ú¶ç²òÀÏ´ï¤ò½ÐÎϤµ¤»¤Æ¡¢YACC ¤Ë C++ ¤Î¹½Ê¸²òÀÏ´ï¤òÀ¸À®¤µ¤»¤ë¤ä¤ê
1433 Êý¤Ç¤¹¡£¥¢¥×¥ê¥±¡¼¥·¥ç¥ó¤ò¥ê¥ó¥¯¤·¤¿»þ¤Ë¡¢¤³¤ÎÊýË¡¤À¤ÈÌäÂ꤬¤¢¤ë¾ì¹ç
1434 ¤¬¤¢¤ê¤Þ¤¹¡£C++ ¤Ç¤ÏÌÀ¼¨Åª¤Ë extern &dquot;C&dquot; Àë¸À¤·¤Ê¤¤¸Â¤ê¡¢¥Ç
1435 ¥Õ¥©¥ë¥È¤Ç¤Ï C ¤Î´Ø¿ô¤ò¸«¤Ä¤±¤é¤ì¤Ê¤¤¤«¤é¤Ç¤¹¡£
1438 To do so, make a C header in YACC like this:
1441 ¤³¤ì¤ò²óÈò¤¹¤ë¤¿¤á¤Ë¤Ï¡¢YACC ¤Ç°Ê²¼¤Î¤è¤¦¤Ê C ¥Ø¥Ã¥À¤òºî¤Ã¤Æ¤¯¤À¤µ¤¤¡£
1444 extern &dquot;C&dquot;
1457 If you want to declare or change yydebug, you must now do it like this:
1460 yydebug ¤òÀë¸À¤â¤·¤¯¤Ï¡¢Êѹ¹¤·¤¿¤¤¾ì¹ç¤Ï¤³¤³¤Ç°Ê²¼¤Î¤è¤¦¤Ë¹Ô¤Ã¤Æ¤¯¤À¤µ
1474 This is because C++'s One Definition Rule, which disallows multiple
1475 definitions of yydebug.
1478 ¤³¤ì¤Ï¡¢C++ ¤Î 'ÄêµÁ¤Ï°ìÅÙ' µ¬Â§¤Î¤¿¤á¤ËɬÍפǡ¢yydebug ¤Î¿½ÅÄêµÁ¤òËÉ
1482 You may also find that you need to repeat the #define of YYSTYPE in your Lex
1483 file, because of C++'s stricter type checking.
1486 ²Ã¤¨¤Æ¡¢C++ ¤Ç¤Ï·¿¥Á¥§¥Ã¥¯¤¬¤è¤ê¸·¤·¤¤¤Î¤Ç¡¢YYSTYPE ¤Î #define ¤ò Lex
1487 ¥Õ¥¡¥¤¥ë¤ËÄɵ¤·¤Ê¤¤¤È¤À¤á¤«¤â¤·¤ì¤Þ¤»¤ó¡£
1490 To compile, do something like this:
1493 ¥³¥ó¥Ñ¥¤¥ë¤¹¤ë¤Ë¤Ï¡¢°Ê²¼¤Î¤è¤¦¤Ë¤·¤Æ¤¯¤À¤µ¤¤¡£
1497 yacc --verbose --debug -d bindconfig2.y -o bindconfig2.cc
1498 cc -c lex.yy.c -o lex.yy.o
1499 c++ lex.yy.o bindconfig2.cc -o bindconfig2
1503 Because of the -o statement, y.tab.h is now called bindconfig2.cc.h, so take
1507 -o »ØÄ꤬¤¢¤ë¤Î¤Ç¡¢y.tab.h ¤Ï bindconfig2.cc.h ¤È¤¤¤¦Ì¾Á°¤Ë¤Ê¤Ã¤Æ¤¤¤ë
1508 ¤³¤È¤Ë¤âα°Õ¤·¤Æ¤¯¤À¤µ¤¤¡£
1511 To summarize: don't bother to compile your Lexer in C++, keep it in C. Make
1512 your Parser in C++ and explain your compiler that some functions are C
1513 functions with extern &dquot;C&dquot; statements.
1516 ¤Þ¤È¤á - »ú¶ç²òÀÏ´ï¤Î¥³¥ó¥Ñ¥¤¥ë¤Ï¡¢¤ï¤¶¤ï¤¶ C++ ¤Ç¤ä¤í¤¦¤È¤·¤Ê¤¤¤Ç C
1517 ¤Ç¤ä¤ë¤³¤È¡£¹½Ê¸²òÀÏ´ï¤Ï C++ ¤Çºî¤ê¡¢C ´Ø¿ô¤ò¸Æ¤Ó½Ð¤·¤¿¤¤»þ¤Ï extern
1518 &dquot;C&dquot; ʸ¤Ç¥³¥ó¥Ñ¥¤¥é¤ËÀë¸À¤¹¤ë¤³¤È¡£
1523 How do Lex and YACC work internally
1526 Lex ¤È YACC ¤ÎÆâÉôÆ°ºî
1530 In the YACC file, you write your own main() function, which calls yyparse()
1531 at one point. The function yyparse() is created for you by YACC, and ends up
1535 ¾å½Ò¤Î YACC ¥Õ¥¡¥¤¥ë¤Ç¤Ï¡¢yyparse() ¤ò¸Æ¤Ó½Ð¤¹ main() ´Ø¿ô¤ò¼«ºî¤·¤Þ¤·
1536 ¤¿¡£yyparse() ´Ø¿ô¤Ï YACC ¤¬¼«Æ°À¸À®¤·¤Æ¤¯¤ì¡¢y.tab.c ¤È¤¤¤¦¥Õ¥¡¥¤¥ë¤Ë
1540 yyparse() reads a stream of token/value pairs from yylex(), which needs to
1541 be supplied. You can code this function yourself, or have Lex do it for you.
1542 In our examples, we've chosen to leave this task to Lex.
1545 yyparse() ¤Ï¡¢Ï¢Â³¤·¤ÆÆþÎϤµ¤ì¤ë¤Ù¤¥È¡¼¥¯¥ó¤È¤½¤ÎÃͤÎÁȤò¡¢yylex() ¤«
1546 ¤éÆɤ߹þ¤ß¤Þ¤¹¡£¤³¤Î´Ø¿ô¤ÏÆɼԤ¬¼«ºî¤µ¤ì¤Æ¤â¹½¤¤¤Þ¤»¤ó¤¬¡¢Lex ¤Ëºî¤é¤»
1547 ¤ë¤³¤È¤â¤Ç¤¤Þ¤¹¡£¤³¤³¤Ç¤Ï¡¢Lex ¤Ë¤ä¤é¤»¤ë¤³¤È¤Ë¤·¤Þ¤¹¡£
1550 The yylex() as written by Lex reads characters from a FILE * file pointer
1551 called yyin. If you do not set yyin, it defaults to standard input. It
1552 outputs to yyout, which if unset defaults to stdout. You can also modify
1553 yyin in the yywrap() function which is called at the end of a file. It
1554 allows you to open another file, and continue parsing.
1557 Lex ¤¬ºîÀ®¤·¤¿ yylex() ¤Ï¡¢yyin ¤È¤¤¤¦ FILE * ·¿¥Õ¥¡¥¤¥ë¥Ý¥¤¥ó¥¿¤«¤éʸ
1558 »úÎó¤òÆɤ߹þ¤ß¤Þ¤¹¡£yyin ¤Ï¥»¥Ã¥È¤·¤Ê¤¤¸Â¤ê¡¢¥Ç¥Õ¥©¥ë¥È¤Ç¤Ïɸ½àÆþÎϤË
1559 ¤Ê¤ê¤Þ¤¹¡£½ÐÎÏ¤Ï yyout ¤Ë¤Ê¤ê¡¢¤³¤ì¤â¥»¥Ã¥È¤µ¤ì¤Æ¤¤¤Ê¤±¤ì¤Ð¡¢¥Ç¥Õ¥©¥ë
1560 ¥È¤Çɸ½à½ÐÎϤȤʤê¤Þ¤¹¡£¤Þ¤¿¡¢¥Õ¥¡¥¤¥ë¤ÎºÇ¸å¤Ç¸Æ¤Ð¤ì¤ë yywrap() ´Ø¿ôÆâ
1561 ¤Î yyin ¤âÊѹ¹¤Ç¤¡¢Ê̤Υե¡¥¤¥ë¤ò¥ª¡¼¥×¥ó¤·¤Æ¥Ñ¡¼¥¹¤·Â³¤±¤ë¤è¤¦¤Ë¤¹¤ë
1565 If this is the case, have it return 0. If you want to end parsing at this
1566 file, let it return 1.
1569 ¤³¤Î¾ì¹ç¤Ï¡¢0 ¤òÌá¤êÃͤȤ·¤ÆÊÖ¤¹¤è¤¦¤Ë¤·¤Æ¤¯¤À¤µ¤¤¡£¥Ñ¡¼¥¹¤ò½ªÎ»¤µ¤»¤¿
1570 ¤¤»þ¤Ï¡¢1 ¤òÊÖ¤¹¤è¤¦¤Ë¤·¤Æ¤¯¤À¤µ¤¤¡£
1573 Each call to yylex() returns an integer value which represents a token type.
1574 This tells YACC what kind of token it has read. The token may optionally
1575 have a value, which should be placed in the variable yylval.
1578 yylex() ¤Ï¸Æ¤Ð¤ì¤ëÅ٤ˡ¢¥È¡¼¥¯¥ó¼ïÊ̤òɽ¤¹À°¿ôÃͤòÊÖ¤·¤Þ¤¹¡£¤³¤ì¤Ï
1579 YACC ¤¬¡¢¤¤¤Þ¤Þ¤Ç¤Ë¤É¤ó¤Ê¥È¡¼¥¯¥ó¤òÆɤ߹þ¤ó¤À¤«¸«Ê¬¤±¤ë¤Î¤Ë»È¤ï¤ì¤Þ¤¹¡£
1580 ¥È¡¼¥¯¥ó¤Ï¿ï°Õ¡¢Ãͤò»ý¤Ä¾ì¹ç¤¬¤¢¤ê¡¢¤½¤Î¾ì¹ç¤Ï yylval ¤ËÃͤ¬³ÊǼ¤µ¤ì¤Þ
1584 By default yylval is of type int, but you can override that from the YACC
1585 file by re#defining YYSTYPE.
1588 ¥Ç¥Õ¥©¥ë¥È¤Ç¤Ï¡¢yylval ¤Ï int ·¿¤Ç¤¹¤¬¡¢YACC ¥Õ¥¡¥¤¥ë¤ÇºÆÅÙ YYSTYPE ¤ò
1589 #define ¤¹¤ë¤³¤È¤Ç¡¢¥ª¡¼¥Ð¡¼¥é¥¤¥É¤¹¤ë¤³¤È¤¬¤Ç¤¤Þ¤¹¡£
1592 The Lexer needs to be able to access yylval. In order to do so, it must be
1593 declared in the scope of the lexer as an extern variable. The original YACC
1594 neglects to do this for you, so you should add the following to your lexter,
1595 just beneath #include <y.tab.h>:
1598 »ú¶ç²òÀÏ´ï¤Ï¡¢yylval ¤Ë¥¢¥¯¥»¥¹¤Ç¤¤ëɬÍפ¬¤¢¤ê¤Þ¤¹¡£¤½¤Î¤¿¤á¤Ë¤Ï¡¢
1599 yylval ¤¬»ú¶ç²òÀÏ´ï¤Î¥¹¥³¡¼¥×¤ËÂФ·¤Æ¡¢³°Éô»²¾ÈÊÑ¿ô¤È¤·¤ÆÀë¸À¤µ¤ì¤Æ¤¤
1600 ¤ëɬÍפ¬¤¢¤ê¤Þ¤¹¡£¥ª¥ê¥¸¥Ê¥ë¤Î YACC ¤Ï¡¢¤³¤ì¤ò¼«Æ°Åª¤Ë¤·¤Æ¤¯¤ì¤Ê¤¤¤Î¤Ç¡¢
1601 °Ê²¼¤ò»ú¶ç²òÀÏ´ï¤Î #include <y.tab.h> ľ¸å¤Ë¡¢µ½Ò¤¹¤ëɬÍפ¬¤¢¤ê
1605 extern YYSTYPE yylval;
1609 Bison, which most people are using these days, does this for you
1613 ¶áǯ¹¤¯»È¤ï¤ì¤Æ¤¤¤ë Bison ¤Ç¤Ï¡¢¤³¤ì¤ò¼«Æ°Åª¤Ë¤ä¤Ã¤Æ¤¯¤ì¤Þ¤¹¡£
1623 As mentioned before, yylex() needs to return what kind of token it
1624 encountered, and put its value in yylval. When these tokens are defined with
1625 the %token command, they are assigned numerical id's, starting from 256.
1628 ¾å½Ò¤·¤¿¤è¤¦¤Ë¡¢yylex() ¤Ï½Ð¸½¤·¤¿¥È¡¼¥¯¥ó¼ïÊ̤òÊÖ¤¹É¬Íפ¬¤¢¤ê¡¢Ãͤò
1629 yylval ¤Ë³ÊǼ¤¹¤ëɬÍפ¬¤¢¤ê¤Þ¤¹¡£¤³¤ì¤é¤Î¥È¡¼¥¯¥ó¤¬¡¢%token ¥³
1630 ¥Þ¥ó¥É¤ÇÄêµÁ¤µ¤ì¤Æ¤¤¤ë¾ì¹ç¡¢³Æ¡¹¤Ë¤Ï 256 ¤«¤é»Ï¤Þ¤ë¿ô»ú¤Î id ¤¬¿¶¤é¤ì
1634 Because of that fact, it is possible to have all ascii characters as a
1635 token. Let's say you are writing a calculator, up till now we would have
1636 written the lexer like this:
1639 ¤³¤Î¤³¤È¤«¤é¡¢Á´¥¢¥¹¥¡¼Ê¸»ú¤ò¥È¡¼¥¯¥ó¤È¤¹¤ë¤³¤È¤â²Äǽ¤Ç¤¹¡£Î㤨¤Ð¡¢ÅÅ
1640 Âî¤òºî¤ë¾ì¹ç¤Ê¤É¡¢¤³¤ì¤Þ¤Ç¤Î·Ð¸³¤òÀ¸¤«¤¹¤È¡¢°Ê²¼¤Î¤è¤¦¤Ë»ú¶ç²òÀÏ´ï¤ò½ñ
1641 ¤±¤ë¤³¤È¤Ë¤Ê¤ë¤Ç¤·¤ç¤¦¤«¡£
1644 [0-9]+ yylval=atoi(yytext); return NUMBER;
1645 [ \n]+ /* eat whitespace */;
1647 \* return MULT;
1648 \+ return PLUS;
1653 Our YACC grammer would then contain:
1656 YACC ʸˡ¤Ë¤Ï¡¢°Ê²¼¤ò´Þ¤à¤³¤È¤Ë¤Ê¤ê¤Þ¤¹¡§
1669 This is needlessly complicated. By using characters as shorthands for
1670 numerical token id's, we can rewrite our lexer like this:
1673 ¤³¤ì¤Ï̵Â̤ËÊ£»¨¤Ê¤À¤±¤Ç¤¹¡£¿ôÃͤòɽ¤¹¥È¡¼¥¯¥ó id ¤ò¡¢Ê¸»úÎó¤ò»È¤Ã¤Æ´Ê
1674 άɽµ¤¹¤ë¤È¡¢»ú¶ç²òÀÏ´ï¤Ï°Ê²¼¤Î¤è¤¦¤Ë½ñ¤Ä¾¤»¤Þ¤¹:
1677 [0-9]+ yylval=atoi(yytext); return NUMBER;
1678 [ \n]+ /* eat whitespace */;
1679 . return (int) yytext[0];
1683 This last dot matches all single otherwise unmatched characters.
1686 ºÇ¸å¤Î¥É¥Ã¥È¤Ï¡¢¥Þ¥Ã¥Á¤·¤Ê¤«¤Ã¤¿Ê¸»úÁ´¤Æ¤òɽ¤·¤Þ¤¹¡£
1689 Our YACC grammer would then be:
1705 This is lots shorter and also more obvious. You do not need to declare these
1706 ascii tokens with %token in the header, they work out of the box.
1709 ¿ïʬ¤ï¤«¤ê¤ä¤¹¤¯¡¢¤Þ¤¿Ã»¤¯¤Ê¤ê¤Þ¤·¤¿¡£¥¢¥¹¥¡¼Ê¸»ú¤òɽ¤¹¥È¡¼¥¯¥ó¤ò¥Ø¥Ã
1710 ¥À¤Î %token ¤ÇÀë¸À¤¹¤ëɬÍפâ¤Ê¤¯¡¢¤½¤Î¤Þ¤Þ¤Ç»È¤¨¤Æ¤¤¤Þ¤¹¡£
1713 One other very good thing about this construct is that Lex will now match
1714 everything we throw at it - avoiding the default behaviour of echoing
1715 unmatched input to standard output. If a user of this calculator uses a ^,
1716 for example, it will now generate a parsing error, instead of being echoed
1720 ¤³¤Î¥³¥ó¥¹¥È¥é¥¯¥È¤Î¤â¤¦¤Ò¤È¤ÄÍ¥¤ì¤¿¤È¤³¤í¤Ï¡¢ÆþÎϤ·¤¿¤â¤Î¤Ï¤Ê¤ó¤Ç¤â
1721 Lex ¤¬¥Þ¥Ã¥Á¤ò¤È¤Ã¤Æ¤¯¤ì¤ë¤è¤¦¤Ë¤Ê¤Ã¤¿¡¢¤È¤¤¤¦¤³¤È¤Ç¤¹ - ¤³¤¦¤¹¤ë¤³¤È
1722 ¤Ç¡¢¥Þ¥Ã¥Á¤·¤Ê¤¤ÆþÎϤòɸ½à½ÐÎϤØÅǤ½Ð¤¹¤È¤¤¤¦¡¢¥Ç¥Õ¥©¥ë¥ÈÆ°ºî¤ò²óÈò¤·
1723 ¤Æ¤¤¤Þ¤¹¡£Î㤨¤Ð¡¢ÅÅÂî¤Ë¥æ¡¼¥¶¤¬ ^ ¤òÆþÎϤ¹¤ë¤È¡¢É¸½à½ÐÎϤؤ½¤Î¤Þ¤Þɽ
1724 ¼¨¤µ¤ì¤ëÂå¤ï¤ê¤Ë¹½Ê¸²òÀÏ¥¨¥é¡¼¤ò½ÐÎϤ¹¤ë¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
1729 Recursion: 'right is wrong'
1732 ºÆµ¢ - 'Á±(right=±¦¡Ë¤Ï°'
1737 Recursion is a vital aspect of YACC. Without it, you can't specify that a
1738 file consists of a sequence of independent commands or statements. Out of
1739 its own accord, YACC is only interested in the first rule, or the one you
1740 designate as the starting rule, with the '%start' symbol.
1743 ºÆµ¢¤Ï¡¢YACC ¤Ç¤Ï¤¤ï¤á¤Æ½ÅÍפǤ¹¡£¤³¤ì¤Ê¤¯¤·¤Æ¤Ï¡¢ÆÈΩ¤¹¤ë¥³¥Þ¥ó¥É¤ä
1744 ʸ¤ÎϢ³¤«¤é¥Õ¥¡¥¤¥ë¤¬¹½À®¤µ¤ì¤Æ¤¤¤ë¡¢¤È¤¤¤¦¤³¤È¤¬¸À¤¨¤Ê¤¯¤Ê¤ê¤Þ¤¹¡£¤Ä
1745 ¤Þ¤ê¡¢YACC ¤Ë¤È¤Ã¤ÆºÇ¤â½ÅÍפʤΤϰìÈÖÌܤε¬Â§¡¢Â¨¤Á¡¢'%start'
1746 ¥·¥ó¥Ü¥ë¤Ç»ØÄꤷ¤¿¡¢µ¯ÅÀ¤ò¼¨¤¹µ¬Â§¤Î¤ß¤È¤¤¤¦¤³¤È¤Ë¤Ê¤ê¤Þ¤¹¡£
1749 Recursion in YACC comes in two flavours: right and left. Left recursion,
1750 which is the one you should use most of the time, looks like this:
1753 YACC ¤Ë¤ª¤±¤ëºÆµ¢¤Ë¤Ï¡¢2¤Ä¤Î¥¿¥¤¥× - ±¦¤Èº¸ - ¤¬¤¢¤ê¤Þ¤¹¡£º¸ºÆµ¢¤Ï¤Û¤È
1754 ¤ó¤É¤Î¾ì¹ç¤Ë»È¤¦¤Ù¤¤â¤Î¤Ç¡¢°Ê²¼¤Î¤è¤¦¤Ê¤â¤Î¤Ç¤¹¡£
1757 commands: /* empty */
1763 This says: a command is either empty, or it consists of more commands,
1764 followed by a command. They way YACC works means that it can now easily chop
1765 off individual command groups (from the front) and reduce them.
1768 ¤³¤ì¤Ï¡¢¥³¥Þ¥ó¥É¤¬¶õ¤Ç¤¢¤ë¡¢¤â¤·¤¯¤ÏÊ£¿ô¤Î¥³¥Þ¥ó¥É¤Î¸å¤Ë¡¢¤¢¤ë¥³¥Þ¥ó¥É
1769 ¤¬Â³¤¯¤È¤¤¤¦°ÕÌ£¤Ç¤¹¡£YACC ¤ÎÆ°ºî¤«¤é¤¹¤ë¤È¡¢¤³¤ì¤Ï¡ÊÁ°Êý¤«¤é¡Ë¸Ä¡¹¤Î
1770 ¥³¥Þ¥ó¥É·²¤ò´Êñ¤ËÀÚ¤êʬ¤±¤Æ¡¢´Ô¸µ¤Ç¤¤ë¤È¤¤¤¦¤³¤È¤ò°ÕÌ£¤·¤Þ¤¹¡£
1773 Compare this to right recursion, which confusingly enough looks better to
1777 ¤³¤ì¤ò±¦ºÆµ¢¤ÈÈæ¤Ù¤Æ¤ß¤ë¤È¡¢Ê¶¤é¤ï¤·¤¤¤Ç¤¹¤¬¸«¤¿ÌܤÏÎɤ¯¤Ê¤ê¤Þ¤¹¡£
1780 commands: /* empty */
1786 But this is expensive. If used as the %start rule, it requires YACC to keep
1787 all commands in your file on the stack, which may take a lot of memory. So
1788 by all means, use left recursion when parsing long statements, like entire
1789 files. Sometimes it is hard to avoid right recursion but if your statements
1790 are not too long, you do not need to go out of your way to use left
1794 ¤·¤«¤·¡¢¤³¤ì¤Ï½èÍý¤È¤·¤Æ¤Ï¹â¤¯¤Ä¤¤Þ¤¹¡£%start µ¬Â§¤È¤·¤Æ»È¤ï¤ì
1795 ¤¿¾ì¹ç¡¢YACC ¤Ï¥Õ¥¡¥¤¥ëÃæ¤ÎÁ´¥³¥Þ¥ó¥É¤ò¥¹¥¿¥Ã¥¯¤ËÊÝ»ý¤·¤Ê¤¯¤Æ¤Ï¤Ê¤é¤º¡¢
1796 ¥á¥â¥ê¤òÂçÎ̤˾ÃÈñ¤·¤Þ¤¹¡£¤³¤Î¤³¤È¤«¤é¡¢¥Õ¥¡¥¤¥ëÁ´Éô¤È¤¤¤¦¤è¤¦¤ÊĹʸ¤Î
1797 ¹½Ê¸²òÀϤò¤¹¤ëºÝ¤Ï¡¢º¸ºÆµ¢¤ò»ÈÍѤ·¤Æ¤¯¤À¤µ¤¤¡£»þ¤Ë¤Ï±¦ºÆµ¢¤Î»ÈÍѤ¬Èò¤±
1798 ¤é¤ì¤Ê¤¤¤è¤¦¤Ê¾õ¶·¤â¤¢¤ë¤«¤â¤·¤ì¤Þ¤»¤ó¤¬¡¢Ê¸¤¬¤¢¤Þ¤ê¤Ë¤âŤ¹¤®¤ë¾ì¹ç¤ò
1799 ½ü¤¤¤Æ¤Ï¡¢º¸ºÆµ¢°Ê³°¤ò»È¤¦É¬ÍפϤʤ¤¤Ç¤·¤ç¤¦¡£
1802 If you have something terminating (and therefore separating) your commands,
1803 right recursion looks very natural, but is still expensive:
1806 ¥³¥Þ¥ó¥É¤ò½ªÃ¼¤·¤Æ¡Ê¤æ¤¨¤Ë¶èÀڤäơˤ¤¤ë¤â¤Î¤¬¤¢¤ë¤è¤¦¤Ê¾ì¹ç¤Ë¤Ï¡¢±¦ºÆ
1807 µ¢¤ò»È¤¦¤È¼«Á³¤Ê´¶¤¸¤Ë¤Ê¤ê¤Þ¤¹¤¬¡¢½èÍý¤¬¹â¤¯¤Ä¤¯¤³¤È¤Ë¤Ï¤«¤ï¤ê¤¢¤ê¤Þ¤»
1811 commands: /* empty */
1813 command SEMICOLON commands
1817 The right way to code this is using left recursion (I didn't invent this
1821 ¤³¤Î¥³¡¼¥É¤Ï¡¢Àµ¤·¤¯¤Ïº¸ºÆµ¢¤ò»È¤Ã¤Æ½ñ¤¤Þ¤¹¡Ê¤³¤ì¤âÉ®¼Ô¤¬¤Ç¤Ã¤Á¤¢¤²¤¿
1822 Ìõ¤Ç¤Ï¤¢¤ê¤Þ¤»¤ó¡Ë¡£
1825 commands: /* empty */
1827 commands command SEMICOLON
1831 Earlier versions of this HOWTO mistakenly used right recursion. Markus
1832 Triska kindly informed us of this.
1835 ¤³¤Î HOWTO ¤Î°ÊÁ°¤Î¥Ð¡¼¥¸¥ç¥ó¤Ç¤â¡¢´Ö°ã¤¨¤Æ±¦ºÆµ¢¤ò»È¤Ã¤Æ¤¤¤Þ¤·¤¿¤¬¡¢
1836 Markus Triska ¤¬¿ÆÀڤˤâ»ØŦ¤·¤Æ¤¯¤ì¤Þ¤·¤¿¡£
1841 Advanced yylval: %union
1844 ¤è¤ê¹âÅÙ¤Ê yylval - %union
1849 Currently, we need to define *the* type of yylval. This however is not
1850 always appropriate. There will be times when we need to be able to handle
1851 multiple data types. Returning to our hypothetical thermostat, perhaps we
1852 want to be able to choose a heater to control, like this:
1855 ¤³¤³¤Þ¤Ç¤Ç¤Ï¡¢yylval ¤Î *·¿¤½¤Î¤â¤Î* ¤òÄêµÁ¤¹¤ëɬÍפ¬¤¢¤ê¤Þ¤·¤¿¡£¤·¤«
1856 ¤·¡¢¤³¤ì¤¬¤¤¤Ä¤âŬÅö¤Ç¤¢¤ë¤È¤Ï¸Â¤ê¤Þ¤»¤ó¡£Ê£¿ô¤Î¥Ç¡¼¥¿·¿¤ò°·¤¨¤Ê¤¯¤Æ¤Ï
1857 ¤Ê¤é¤Ê¤¤¤³¤È¤â¤¢¤ë¤«¤âÃΤì¤Ê¤¤¤«¤é¤Ç¤¹¡£²¾ÁÛ²¹ÅÙÄ´Àá´ï¤ÎÎã¤ËÌá¤Ã¤Æ¡¢À©
1858 ¸æ¤¹¤ë¤Ù¤¥Ò¡¼¥¿¡¼¤òÁª¤Ó¤¿¤¤¤È¤·¤¿¤é¡¢°Ê²¼¤Î¤è¤¦¤Ë¤Ê¤ê¤Þ¤¹¡£
1862 Selected 'mainbuilding' heater
1863 target temperature 23
1864 'mainbuilding' heater target temperature now 23
1868 What this calls for is for yylval to be a union, which can hold both strings
1869 and integers - but not simultaneously.
1872 ¥Ý¥¤¥ó¥È¤Ï yylval ¤¬¶¦ÍÑÂΤˤʤäơ¢Ê¸»úÎó¤ÈÀ°¿ô¤ÎξÊý¤òÊÝ»ý¤¹¤ë¤³¤È¤¬
1873 ¤Ç¤¤ë¡¢¤È¤¤¤¦¤³¤È¤Ç¤¹ - Ʊ»þ¤Ë°ì½ï¤Ç¤Ï¤¢¤ê¤Þ¤»¤ó¤¬¡£
1876 Remember that we told YACC previously what type yylval was supposed to by by
1877 defining YYSTYPE. We could conceivably define YYSTYPE to be a union this
1878 way, by YACC has an easier method for doing this: the %union statement.
1881 °ÊÁ°¡¢YACC ¤ËÂФ·¤Æ yylval ¤Î·¿¤ò¡¢YYSTYPE ¤È¤·¤ÆÄêµÁ¤·¤Æ¤¤¤¿¤Î¤ò»×¤¤
1882 ½Ð¤·¤Æ¤¯¤À¤µ¤¤¡£¤³¤ì¤Ï¡¢YACC ¤Î %union ʸ¤È¤¤¤¦¡¢¤è¤ê´Êñ¤ÊÊýË¡
1883 ¤Ç¡¢¶¦ÍÑÂΤȤ·¤ÆÄêµÁ¤¹¤ë¤³¤È¤â¤Ç¤¤¿¤Î¤Ç¤Ï¤Ê¤¤¤Ç¤·¤ç¤¦¤«¡£
1886 Based on Example 4, we now write the Example 7 YACC grammar. First the
1890 Example 4 ¤Ë´ð¤Å¤¤¤Æ¡¢Example 7 ¤Î YACC ʸˡ¤ò½ñ¤¤¤Æ¤ß¤Þ¤¹¡£¤Þ¤º¤Ï¤½¤Î
1894 %token TOKHEATER TOKHEAT TOKTARGET TOKTEMPERATURE
1902 %token <number> STATE
1903 %token <number> NUMBER
1904 %token <string> WORD
1908 We define our union, which contains only a number and a string. Then
1909 using an extended %token syntax, we explain to YACC which part
1910 of the union each token should access.
1913 ¿ô»ú¤Èʸ»úÎó¤Î¤ß¤ò´Þ¤à¡¢¶¦ÍÑÂΤòÄêµÁ¤·¤Þ¤¹¡£¤½¤ì¤«¤é³ÈÄ¥¤µ¤ì¤¿
1914 %token ¥·¥ó¥¿¥Ã¥¯¥¹¤Ç¡¢YACC ¤Ë¶¦ÍÑÂΤΤɤÎÉôʬ¤Ë¡¢¤½¤ì¤¾¤ì¤Î¥È¡¼
1915 ¥¯¥ó¤¬¥¢¥¯¥»¥¹¤¹¤Ù¤¤«»Ø¼¨¤·¤Æ¤¤¤Þ¤¹¡£
1918 In this case, we let the STATE token use an integer, as before. Same goes
1919 for the NUMBER token, which we use for reading temperatures.
1922 ¤³¤Î¾ì¹ç¡¢°ÊÁ°¤ä¤Ã¤¿¤è¤¦¤Ë STATE ¥È¡¼¥¯¥ó¤Ë int ·¿¤ò³ä¤êÅö¤Æ¤Þ¤¹¡£Æ±ÍÍ
1923 ¤Ë¡¢²¹ÅÙ¤òÆɤ߼è¤ë¤¿¤á¤Î NUMBER ¥È¡¼¥¯¥ó¤â³ä¤êÅö¤Æ¤Þ¤¹¡£
1926 New however is the WORD token, which is declared to need a string.
1929 ¿·¤·¤¤¤Î¤Ï WORD ¥È¡¼¥¯¥ó¤Ç¡¢Ê¸»úÎó¤Ç¤¢¤ë¤ÈÀë¸À¤µ¤ì¤Æ¤¤¤Þ¤¹¡£
1932 The Lexer file changes a bit too:
1935 »ú¶ç²òÀÏ¥×¥í¥°¥é¥à¤Î¥Õ¥¡¥¤¥ë¤â¡¢Â¿¾¯Êѹ¹¤¬¤¢¤ê¤Þ¤¹¡£
1942 #include &dquot;y.tab.h&dquot;
1945 [0-9]+ yylval.number=atoi(yytext); return NUMBER;
1946 heater return TOKHEATER;
1947 heat return TOKHEAT;
1948 on|off yylval.number=!strcmp(yytext,&dquot;on&dquot;); return STATE;
1949 target return TOKTARGET;
1950 temperature return TOKTEMPERATURE;
1951 [a-z0-9]+ yylval.string=strdup(yytext);return WORD;
1952 \n /* ignore end of line */;
1953 [ \t]+ /* ignore whitespace */;
1963 #include &dquot;y.tab.h&dquot;
1966 [0-9]+ yylval.number=atoi(yytext); return NUMBER;
1967 heater return TOKHEATER;
1968 heat return TOKHEAT;
1969 on|off yylval.number=!strcmp(yytext,&dquot;on&dquot;); return STATE;
1970 target return TOKTARGET;
1971 temperature return TOKTEMPERATURE;
1972 [a-z0-9]+ yylval.string=strdup(yytext);return WORD;
1973 \n /* ²þ¹Ô¤Ï̵»ë */;
1974 [ \t]+ /* ¥Û¥ï¥¤¥È¥¹¥Ú¡¼¥¹¤Ï̵»ë */;
1980 As you can see, we don't access the yylval directly anymore, we add a suffix
1981 indicating which part we want to access. We don't need to do that in the
1982 YACC grammar however, as YACC performs the magic for us:
1985 ¤ªµ¤¤Å¤¤Ë¤Ê¤Ã¤¿¤è¤¦¤Ë¡¢¤â¤¦ yylval ¤½¤Î¤â¤Î¤Ë¤ÏľÀÜ¥¢¥¯¥»¥¹¤·¤Æ¤ª¤é¤º¡¢
1986 ¥¢¥¯¥»¥¹¤·¤¿¤¤Éôʬ¤ò¼¨¤¹¤Î¤Ë¡¢¥µ¥Õ¥£¥Ã¥¯¥¹¤òÉղ䷤Ƥ¤¤Þ¤¹¡£YACC ¤Ë¤Ï
1987 °Ê²¼¤Î¤è¤¦¤ÊËâË¡¤¬¤¢¤ë¤Î¤Ç¡¢YACC ʸˡ¤Ç¤Ï¤³¤ì¤ÏÉÔÍפǤ¹¡£
1993 printf(&dquot;\tSelected heater '%s'\n&dquot;,$2);
2000 Because of the %token declaration above, YACC automatically picks
2001 the 'string' member from our union. Note also that we store a copy of $2,
2002 which is later used to tell the user which heater he is sending commands to:
2005 Á°½Ò¤Î %token Àë¸À¤Î¤ª¤«¤²¤Ç¡¢YACC ¤Ï¼«Æ°Åª¤Ë¶¦ÍÑÂΤ«¤é 'ʸ»úÎó'
2006 ¥á¥ó¥Ð¤òÆɤ߼è¤Ã¤Æ¤¯¤ì¤Æ¤¤¤Þ¤¹¡£¤Þ¤¿¡¢¤³¤³¤Ç¤Ï¸å¤Ç¡¢¥³¥Þ¥ó¥É¤ÎÁ÷¤êÀè¤Ë
2007 ¤Ê¤Ã¤Æ¤¤¤ë¥Ò¡¼¥¿¡¼¤ò¥æ¡¼¥¶¤ËÄÌÃΤ¹¤ë¤Î¤Ë»È¤ï¤ì¤ë¡¢$2 ¤Î¥³¥Ô¡¼¤â³ÊǼ¤·
2012 TOKTARGET TOKTEMPERATURE NUMBER
2014 printf(&dquot;\tHeater '%s' temperature set to %d\n&dquot;,heater,$3);
2020 For more details, read example7.y.
2023 ¾ÜºÙ¤Ï example7.y ¤ò»²¾È¤¯¤À¤µ¤¤¡£
2035 Especially when learning, it is important to have debugging facilities.
2036 Luckily, YACC can give a lot of feedback. This feedback comes at the cost of
2037 some overhead, so you need to supply some switches to enable it.
2040 ¥×¥í¥°¥é¥à¤ò¡¢Æ°¤«¤·¤Ê¤¬¤é³Ø¤Ö¤è¤¦¤Ê»þ¤ÏÆäˡ¢¥Ç¥Ð¥Ã¥°µ¡Ç½¤¬¤¢¤ë¤³¤È
2041 ¤¬½ÅÍפˤʤê¤Þ¤¹¡£¹¬±¿¤Ë¤âYACC¤Ï¡¢Â¿¿ô¤Î¥Õ¥£¡¼¥É¥Ð¥Ã¥¯¤òÊÖ¤¹µ¡Ç½¤ò»ý¤Ã
2042 ¤Æ¤¤¤Þ¤¹¡£¤³¤Îµ¡Ç½¤Ï¤¤¤¯¤é¤«¥ª¡¼¥Ð¡¼¥Ø¥Ã¥É¤òɬÍפȤ¹¤ë¤Î¤Ç¡¢»ÈÍѤˤ¢¤¿¤Ã
2043 ¤Æ¤Ï´ö¤Ä¤«¤Î¥¹¥¤¥Ã¥Á¤ò¥¤¥Í¡¼¥Ö¥ë¤Ë¤¹¤ëɬÍפ¬¤¢¤ê¤Þ¤¹¡£
2046 When compiling your grammar, add --debug and
2047 --verbose to the YACC commandline. In your grammar C
2048 heading, add the following:
2051 ʸˡ¤ò¥³¥ó¥Ñ¥¤¥ë¤¹¤ë»þ¤Ï¡¢YACC ¤Î¥³¥Þ¥ó¥É¥é¥¤¥ó¤Ë¡¢--debug
2052 ¤ä --verbose ¤ò¤Ä¤±¤Þ¤¹¡£Ê¸Ë¡¥Õ¥¡¥¤¥ë¤Î C ¥Ø¥Ã¥ÀÉô¤Ë¤Ï¡¢°Ê²¼
2058 This will generate the file 'y.output' which explains the state machine that
2062 ¤³¤ì¤Ï 'y.output' ¤È¤¤¤¦¡¢½ÐÎϤµ¤ì¤¿¥¹¥Æ¡¼¥È¥Þ¥·¥ó¤òÀâÌÀ¤¹¤ë¥Õ¥¡¥¤¥ë¤ò
2066 When you now run the generated binary, it will output a *lot* of what is
2067 happening. This includes what state the state machine currently has, and
2068 what tokens are being read.
2071 À¸À®¤µ¤ì¤¿¥Ð¥¤¥Ê¥ê¤ò¼Â¹Ô¤¹¤ë¤È¡¢¤³¤Î¥Õ¥¡¥¤¥ë¤Ï¡¢¸½ºßµ¯¤³¤Ã¤Æ¤¤¤ë¤³¤È¤Ë
2072 ¤Ä¤¤¤Æ¡¢*Èó¾ï¤Ë¤¿¤¯¤µ¤ó¤Î* ¾ðÊó¤ò½ÐÎϤ·¤Æ¤¯¤ì¤Þ¤¹¡£¤³¤ì¤Ë¤Ï¡¢¥¹¥Æ¡¼¥È
2073 ¥Þ¥·¥ó¤¬¤É¤ó¤Ê¥¹¥Æ¡¼¥È¤òÊÝͤ·¤Æ¤¤¤ë¤Î¤«¡¢¤É¤ó¤Ê¥È¡¼¥¯¥ó¤¬Æɤ߹þ¤Þ¤ì¤Æ
2074 ¤¤¤ë¤Î¤«Åù¤Î¾ðÊó¤â´Þ¤Þ¤ì¤Þ¤¹¡£
2077 Peter Jinks wrote a page on <URL
2078 URL=&dquot;http://www.cs.man.ac.uk/~pjj/cs2121/debug.html&dquot; name=&dquot;debugging&dquot;> which
2079 contains some common errors and how to solve them.
2083 URL="http://www.cs.man.ac.uk/~pjj/cs2121/debug.html" name="debugging">
2084 ¤Ë¤è¤¯¤¢¤ë¥¨¥é¡¼¤ä¡¢¥¨¥é¡¼¤Ø¤ÎÂнè¤Î»ÅÊý¤Ê¤É¤òºÜ¤»¤Æ¤¤¤Þ¤¹¡£
2097 Internally, your YACC parser runs a so called 'state machine'. As the name
2098 implies, this is a machine that can be in several states. Then there are
2099 rules which govern transitions from one state to another. Everything starts
2100 with the so called 'root' rule I mentioned earlier.
2103 YACC ¤ÇÀ¸À®¤·¤¿¹½Ê¸²òÀÏ´ï¤Ï¡¢ÆâÉôŪ¤Ë¤Ï '¥¹¥Æ¡¼¥È¥Þ¥·¥ó' ¤È¤¤¤¦¤â¤Î¤ò
2104 ¼Â¹Ô¤·¤Æ¤¤¤Þ¤¹¡£Ì¾Á°¤¬¼¨¤¹¤è¤¦¤Ë¡¢¤³¤ì¤Ï¡¢¤¤¤¯¤Ä¤«¤Î¥¹¥Æ¡¼¥È¡Ê¾õÂ֡ˤò
2105 ¤È¤êÆÀ¤ë¥Þ¥·¥ó¡Êµ¡³£¡Ë¤Î¤³¤È¤Ç¤¹¡£¤É¤Î¥¹¥Æ¡¼¥È¤«¤é¤É¤Î¥¹¥Æ¡¼¥È¤ØÁ«°Ü¤¹
2106 ¤ë¤«¤ò·èÄꤹ¤ëµ¬Â§¡¢¤È¤¤¤¦¤Î¤â¸ºß¤·¤Þ¤¹¡£Á´¤Æ¤ÏÉ®¼Ô¤¬Á°½Ò¤·¤¿¡¢¤¤¤ï¤æ
2107 ¤ë 'root' µ¬Â§¤¬µ¯ÅÀ¤È¤Ê¤ê¤Þ¤¹¡£
2110 To quote from the output from the Example 7 y.output:
2113 Example 7 ¤Î y.output ¤«¤é¤Î½ÐÎϤò°úÍѤ¹¤ë¤È -
2118 ZONETOK , and go to state 1
2120 $default reduce using rule 1 (commands)
2122 commands go to state 29
2123 command go to state 2
2124 zone_set go to state 3
2128 By default, this state reduces using the 'commands' rule. This is the
2129 aforementioned recursive rule that defines 'commands' to be built up from
2130 individual command statements, followed by a semicolon, followed by possibly
2134 ¤³¤Î¥¹¥Æ¡¼¥È¤Ï¡¢¥Ç¥Õ¥©¥ë¥È¤Ç¤Ï 'commands' µ¬Â§¤òÍѤ¤¤Æ´Ô¸µ¤·¤Þ¤¹¡£¤³¤ì
2135 ¤Ï¡¢Á°½Ò¤·¤¿ºÆµ¢¤Ë´Ø¤¹¤ëµ¬Â§¤Ç¤¢¤ê¡¢¸Ä¡¹¤Î¥³¥Þ¥ó¥Éʸ¡¢¥»¥ß¥³¥í¥ó¡¢¹¹¤Ë
2136 ³¤¯¥³¥Þ¥ó¥É¤«¤é¹½À®¤µ¤ì¤ë 'commands' ¤òÄêµÁ¤·¤Æ¤¤¤Þ¤¹¡£
2139 This state reduces until it hits something it understands, in this case, a
2140 ZONETOK, ie, the word 'zone'. It then goes to state 1, which deals further
2141 with a zone command:
2144 ¤³¤Î¥¹¥Æ¡¼¥È¤Ï¡¢²ò¼á¤Ç¤¤ë¥È¡¼¥¯¥ó - ¤³¤Î¾ì¹ç ZONETOK ¨¤Á 'zone' ¤È¤¤
2145 ¤¦Ã±¸ì - ¤ËÅþ㤹¤ë¤Þ¤Ç´Ô¸µ¤·¤Þ¤¹¡£¤½¤ì¤«¤é¡¢¥¹¥Æ¡¼¥È 1 ¤ØÁ«°Ü¤·¡¢zone
2146 ¥³¥Þ¥ó¥É¤ò¤µ¤é¤Ë¾Ü¤·¤¯½èÍý¤·¤Þ¤¹¡£
2151 zone_set -> ZONETOK . quotedname zonecontent (rule 4)
2153 QUOTE , and go to state 4
2155 quotedname go to state 5
2159 The first line has a '.' in it to indicate where we are: we've just seen a
2160 ZONETOK and are now looking for a 'quotedname'. Apparently, a quotedname
2161 starts with a QUOTE, which sends us to state 4.
2164 ºÇ½é¤Î¹Ô¤Ï¸½ºß¤Î¾ì½ê¤ò¼¨¤¹ '.' ¤ò´Þ¤ó¤Ç¤¤¤Þ¤¹ - ZONETOK ¤Ï´û¤Ë¸«¤Ä¤«¤Ã
2165 ¤¿¤Î¤Ç¡¢¼¡¤Ë 'quotedname' ¤òõ¤·¤Æ¤¤¤Þ¤¹¡£quotedname ¤Ï QUOTE ¤Ç»Ï¤Þ¤ë
2166 ¤Î¤Ç¡¢¥¹¥Æ¡¼¥È 4 ¤ËÁ«°Ü¤¹¤ë¤³¤È¤Ë¤Ê¤ê¤Þ¤¹¡£
2169 To follow this further, compile Example 7 with the flags mentioned in the
2173 °Ê¾å¤Ë¤Ä¤¤¤Æ¤â¤¦¾¯¤··¡¤ê²¼¤²¤¿¤¤Êý¤Ï¡¢Example 7 ¤ò¥Ç¥Ð¥Ã¥°¤Î¾Ï¤Ç¿¨¤ì¤¿
2174 ¥Õ¥é¥°¤òÉÕ¤±¤Æ¥³¥ó¥Ñ¥¤¥ë¤·¤Æ¤ß¤Æ¤¯¤À¤µ¤¤¡£
2179 Conflicts: 'shift/reduce', 'reduce/reduce'
2182 ¥³¥ó¥Õ¥ê¥¯¥È: 'shift/reduce', 'reduce/reduce'
2186 Whenever YACC warns you about conflicts, you may be in for trouble. Solving
2187 these conflicts appears to be somewhat of an art form that may teach you a
2188 lot about your language. More than you possibly would have wanted to know.
2191 YACC ¤¬¥³¥ó¥Õ¥ê¥¯¥È¤Ë¤Ä¤¤¤Æ·Ù¹ð¤ò½Ð¤¹»þ¤Ï¡¢²¿¤«ÌäÂ꤬¤¢¤ë»þ¤Ç¤·¤ç¤¦¡£
2192 ¥³¥ó¥Õ¥ê¥¯¥È¤ò²ò·è¤¹¤ëºî¶È¤Ë¤Ï¡¢¤È¤¤Ë¿¦¿Í·Ý¤Î¤è¤¦¤ÊÆü줵¤¬¤¢¤ê¡¢¤¢¤Ê
2193 ¤¿¤Î»ÈÍѤ·¤Æ¤¤¤ë¸À¸ì¤Ë¤Ä¤¤¤Æ¿¤¯¤Î¤³¤È¤ò¶µ¤¨¤Æ¤¯¤ì¤ë¤³¤È¤Ç¤·¤ç¤¦ - ¤½
2194 ¤ì¤â¤¢¤Ê¤¿¤¬ÃΤꤿ¤«¤Ã¤¿¤³¤È°Ê¾å¤Î¤³¤È¤ò¡£
2197 The problems revolve around how to interpret a sequence of tokens. Let's
2198 suppose we define a language that needs to accept both these commands:
2201 ÌäÂê¤Ï¡¢Ï¢Â³¤¹¤ë¥È¡¼¥¯¥ó¤ò¤É¤¦²ò¼á¤¹¤ë¤«¡¢¤È¤¤¤¦ÅÀ¤òÃæ¿´¤Ëµ¯¤³¤ê¤Þ¤¹¡£
2202 °Ê²¼¤ÎÆó¤Ä¤Î¥³¥Þ¥ó¥É¤ò¼õ¤±ÉÕ¤±¤ëɬÍפ¬¤¢¤ë¸À¸ì¤ò¡¢ÄêµÁ¤¹¤ë¤È¤·¤Þ¤·¤ç¤¦¡£
2206 delete heater number1
2210 To do this, we define this grammar:
2213 ¤³¤ì¤ò¤¹¤ë¤Ë¤Ï¡¢°Ê²¼¤ÎʸˡÄêµÁ¤¬É¬ÍפǤ¹¡£
2217 TOKDELETE TOKHEATER mode
2225 TOKDELETE TOKHEATER WORD
2232 You may already be smelling trouble. The state machine starts by reading the
2233 word 'delete', and then needs to decide where to go based on the next token.
2234 This next token can either be a mode, specifying how to delete the heaters,
2235 or the name of a heater to delete.
2238 ¤â¤¦ÌäÂê¤Î½¤¤¤¬¤·¤Æ¤¤Þ¤·¤¿¤Í¡£¥¹¥Æ¡¼¥È¥Þ¥·¥ó¤Ï 'delete' ¤È¤¤¤¦Ã±¸ì¤ò
2239 Æɤळ¤È¤«¤é»Ï¤á¤Æ¡¢¼¡¤Î¥È¡¼¥¯¥ó¤¬²¿¤Ç¤¢¤ë¤«¤Ë¤è¤Ã¤Æ¡¢Á«°ÜÀè¤ò·èÄꤹ¤ë
2240 ɬÍפ¬¤¢¤ê¤Þ¤¹¡£¼¡¤Î¥È¡¼¥¯¥ó¤È¤¤¤¦¤Î¤Ï¡¢¥Ò¡¼¥¿¡¼¤Îºï½üÊýË¡¤ò»ØÄꤹ¤ë¥â¡¼
2241 ¥É¡¢¤â¤·¤¯¤Ïºï½ü¤¹¤Ù¤¥Ò¡¼¥¿¡¼¤Î̾Á°¤Ç¤¹¡£
2244 The problem however is that for both commands, the next token is going to be
2245 a WORD. YACC has therefore no idea what to do. This leads to
2246 a 'reduce/reduce' warning, and a further warning that the 'delete_a_heater'
2247 node is never going to be reached.
2250 ÌäÂê¤ÏξÊý¤Î¥³¥Þ¥ó¥É¤Ë¤È¤Ã¤Æ¡¢¼¡¤Î¥È¡¼¥¯¥ó¤¬ WORD ¤Ë¤Ê¤ë¤È¤¤¤¦¤³¤È¤Ç¤¹¡£
2251 YACC ¤Ï¡¢¤³¤Î¾ì¹ç¤É¤¦¤·¤ÆÎɤ¤¤«¤ï¤«¤ê¤Þ¤»¤ó¡£¤³¤ì¤¬ 'reduce/reduce'¡¢
2252 ¹¹¤Ë¤Ï 'delete_a_heater' ¥Î¡¼¥É¤Ë·è¤·¤Æ㤹¤ë¤³¤È¤¬¤Ê¤¤¡¢¤È¤¤¤¦·Ù¹ð¤Ë
2256 In this case the conflict is resolved easily (ie, by renaming the first
2257 command to 'delete heaters all', or by making 'all' a separate token), but
2258 sometimes it is harder. The y.output file generated when you pass yacc the
2259 &dquot;--verbose flag can be of tremendous help.
2262 ¤³¤Î¾ì¹ç¤Î¥³¥ó¥Õ¥ê¥¯¥È¤Ï¡¢´Êñ¤Ë²ò·è¤Ç¤¤Þ¤¹¤¬¡ÊºÇ½é¤Î¥³¥Þ¥ó¥É̾¤ò'
2263 delete heaters all' ¤ËÊѹ¹¤·¤¿¤ê¡¢'all' ¤òÆÈΩ¤·¤¿¥È¡¼¥¯¥ó¤È¤·¤ÆÄêµÁ¤¹
2264 ¤ë¤Ê¤É¡Ë¡¢¤â¤Ã¤ÈÊ£»¨¤Ë¤Ê¤ë¤³¤È¤â¤¢¤ê¤Þ¤¹¡£--verbose ¥Õ¥é¥°¤ò
2265 Éղä·¤Æyacc¤òÄ̤¹¤ÈÀ¸À®¤µ¤ì¤ë y.output ¥Õ¥¡¥¤¥ë¤Ï¡¢¤½¤ó¤Ê»þ¤ËÈó¾ï¤Ë½õ
2277 GNU YACC (Bison) comes with a very nice info-file (.info) which documents
2278 the YACC syntax very well. It mentions Lex only once, but otherwise it's
2279 very good. You can read .info files with Emacs or with the very nice
2280 tool 'pinfo'. It is also available on the GNU site:
2281 <URL name="BISON Manual" URL="http://www.gnu.org/manual/bison/">.
2284 GNU YACC (Bison) ¤Ë¤Ï¡¢YACC ¤Î¥·¥ó¥¿¥Ã¥¯¥¹¤ò¾Ü¤·¤¯µ½Ò¤·¤¿¤¹¤Ð¤é¤·¤¤
2285 info ¥Õ¥¡¥¤¥ë (.info) ¤¬ÉÕ°¤·¤Æ¤¤Þ¤¹¡£Lex ¤Ï°ìÅÙ¤·¤«¿¨¤ì¤é¤ì¤Æ¤¤¤Þ¤»
2286 ¤ó¤¬¡¢¤½¤ÎÅÀ¤ò½ü¤±¤ÐÍ¥¤ì¤Æ¤¤¤Þ¤¹¡£.info ¥Õ¥¡¥¤¥ë¤Ï Emacs ¤ä 'pinfo' ¤È
2287 ¤¤¤Ã¤¿¡¢»È¤¤¾¡¼ê¤ÎÎɤ¤¥Ä¡¼¥ë¤ÇÆɤळ¤È¤¬¤Ç¤¤Þ¤¹¡£°Ê²¼¤Î GNU ¥µ¥¤¥È¤Ç
2288 ¤âÆþ¼ê²Äǽ¤Ç¤¹¡§<URL name="BISON Manual"
2289 URL="http://www.gnu.org/manual/bison/">.
2292 Flex comes with a good manpage which is very useful if you already
2293 have a rough understanding of what Flex does. The
2294 <URL name="Flex Manual" URL="http://www.gnu.org/manual/flex/"> is also
2298 Flex ¤Ë¤ÏÍ¥¤ì¤¿ man ¥Ú¡¼¥¸¤¬ÉÕ°¤·¤Æ¤¤Þ¤¹¡£Flex ¤Ç²¿¤¬¤Ç¤¤ë¤«¡¢¤¶¤Ã
2299 ¤È¤Ç¤âÍý²ò¤·¤Æ¤¤¤ë¿Í¤Ë¤Ï¡¢¤È¤Æ¤âÍÍѤǤ·¤ç¤¦¡£<URL name="Flex Manual"
2300 URL="http://www.gnu.org/manual/flex/"> ¤â¥ª¥ó¥é¥¤¥ó¤ÇÆþ¼ê²Äǽ¤Ç¤¹¡£
2303 After this introduction to Lex and YACC, you may find that you need more
2304 information. I haven't read any of these books yet, but they sound good:
2307 ¤³¤Î Lex ¤È YACC ¤Î¥¤¥ó¥È¥í¥À¥¯¥·¥ç¥ó¤ò½ª¤¨¤Æ¡¢¤â¤¦¾¯¤·¾ðÊó¤¬Íߤ·¤¤¤È
2308 »×¤ï¤ì¤¿Êý¤â¤¤¤ë¤Ç¤·¤ç¤¦¡£°Ê²¼¤ÎËܤϡ¢É®¼Ô¤ÏÁ´¤¯Æɤó¤Ç¤¤¤Þ¤»¤ó¤¬¡¢¥¿¥¤
2309 ¥È¥ë¤Ï¤¤¤¤´¶¤¸¤Ç¤¹ -
2312 <tag>Bison-The Yacc-Compatible Parser Generator</tag>
2315 By Charles Donnelly and Richard Stallman. An <URL name="Amazon"
2316 url="http://www.amazon.com/exec/obidos/ASIN/0595100325/qid=989165194/sr=1-2/ref=sc_b_3/002-7737249-1404015">user found it useful.
2319 Charles Donnelly and Richard Stallman ¤Ë¤è¤ë¤â¤Î¤Ç¤¹¡£¤³¤ÎËܤòµ¤¤ËÆþ¤Ã
2320 ¤¿<URL name="Amazon"
2321 url="http://www.amazon.com/exec/obidos/ASIN/0595100325/qid=989165194/sr=1-2/ref=sc_b_3/002-7737249-1404015">
2322 ¥æ¡¼¥¶¤â¤¤¤é¤Ã¤·¤ã¤ë¤è¤¦¤Ç¤¹¡£
2324 <tag>Lex & Yacc</tag>
2327 By John R. Levine, Tony Mason and Doug Brown.
2328 Considered to be the standard
2329 work on this subject, although a bit dated. Reviews over at <URL name="Amazon" URL="http://www.amazon.com/exec/obidos/ASIN/1565920007/ref=sim_books/002-7737249-1404015">.
2332 John R. Levine, Tony Mason and Doug Brown ¤Ë¤è¤ë¤â¤Î¤Ç¤¹¡£ ¤Á¤ç¤Ã¤È¸Å
2333 ¤¤¤Ç¤¹¤¬¡¢¤³¤Î¥Æ¡¼¥Þ¤Ë´Ø¤·¤Æ¤Ï¶µ²Ê½ñŪ¸ºß¤Ç¤¹¡£<URL name="Amazon"
2334 URL="http://www.amazon.com/exec/obidos/ASIN/1565920007/ref=sim_books/002-7737249-1404015">
2335 ¤Ë¥ì¥Ó¥å¡¼¤¬¤¢¤ê¤Þ¤¹¡£
2337 <tag>Compilers : Principles, Techniques, and Tools</tag>
2340 By Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. The 'Dragon Book'.
2341 From 1985 and they just keep printing it. Considered the standard work on
2342 constructing compilers.
2345 Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman ¤Ë¤è¤ë¤â¤Î¤Ç¤¹¡£Ä̾Π'¥É
2346 ¥é¥´¥ó¥Ö¥Ã¥¯'¡£1985ǯ¤Ë½Ð¤¿¤È¤¤¤¦¤Î¤Ë¡¢¤¨¤ó¤¨¤ó¤ÈÁýºþ¤µ¤ì¤Æ¤¤¤Þ¤¹¡£¥³
2347 ¥ó¥Ñ¥¤¥é³«È¯¤Ë´Ø¤·¤Æ¤Ï¶µ²Ê½ñŪ¸ºß¤Ç¤¹¡£<URL Name="Amazon"
2348 URL="http://www.amazon.com/exec/obidos/ASIN/0201100886/ref=sim_books/002-7737249-1404015">
2354 Thomas Niemann wrote a document discussing how to write compilers and
2355 calculators with Lex & YACC. You can find it <URL
2356 URL="http://epaperpress.com/y_man.html" name="here">.
2359 Thomas Niemann ¤Ï Lex ¤È YACC ¤ò»È¤Ã¤¿¥³¥ó¥Ñ¥¤¥é¤ÈÅÅÂî¤Îºî¤êÊý¤Ë¤Ä¤¤¤Æ
2360 ¥É¥¥å¥á¥ó¥È¤ò½ñ¤¤¤Æ¤ª¤ê¡¢<URL URL="http://epaperpress.com/y_man.html"
2361 name="¤³¤³">¤Ë¤¢¤ê¤Þ¤¹¡£
2364 The moderated usenet newsgroup comp.compilers can also be very useful but
2365 please keep in mind that the people there are not a dedicated parser
2366 helpdesk! Before posting, read their interesting <URL name="page"
2367 URL="http://compilers.iecc.com/"> and especially the <URL name="FAQ"
2368 URL="http://compilers.iecc.com/faq.txt">.
2371 comp.compilers ¤È¤¤¤¦¥Ë¥å¡¼¥¹¥°¥ë¡¼¥×¤â usenet ¤Ë¤¢¤ê¡¢¤Ê¤«¤Ê¤«ÍøÍѲÁ
2372 Ãͤ¬¤¢¤ê¤Þ¤¹¡£¤Ç¤¹¤¬¡¢»²²Ã¤·¤Æ¤¤¤ë¿Í¤¿¤Á¤Ï¹½Ê¸²òÀÏ´ï¤ÎÀ찥إë¥×¥Ç¥¹¥¯
2373 Í×°÷¤Ç¤Ï¤¢¤ê¤Þ¤»¤ó! Åê¹Æ¤¹¤ëÁ°¤Ë¡¢Èà¤é¤Î<URL name="¥Ú¡¼¥¸"
2374 URL="http://compilers.iecc.com/">¤Ï¶½Ì£¿¼¤¤¤Î¤Ç¸«¤ë¤³¤È¡¢Æä˼ÁÌä¤Ï
2375 <URL name="FAQ" URL="http://compilers.iecc.com/faq.txt">¤Ë¤Á¤ã¤ó¤ÈÌܤò
2376 Ä̤·¤¿¾å¤ÇÅꤲ¤ë¤³¤È¡£
2379 Lex - A Lexical Analyzer Generator by M. E. Lesk and E. Schmidt is one of
2380 the original reference papers. It can be found
2381 <url NAME="here" url="http://www.cs.utexas.edu/users/novak/lexpaper.htm">.
2384 Lex - A Lexical Analyzer Generator by M. E. Lesk and E. Schmidt ¤ÏÉ®¼Ô
2385 ¤¬°úÍѤ·¤¿¥É¥¥å¥á¥ó¥È¤Î¤Ò¤È¤Ä¤Ç¤¹¡£<url NAME="¤³¤³"
2386 url="http://www.cs.utexas.edu/users/novak/lexpaper.htm">¤ÇÆþ¼ê¤Ç¤¤Þ¤¹¡£
2389 Yacc: Yet Another Compiler-Compiler by Stephen C. Johnson is one of the
2390 original reference papers for YACC. It can be found
2391 <url NAME="here" url="http://www.cs.utexas.edu/users/novak/yaccpaper.htm">.
2392 It contains useful hints on style.
2395 Yacc: Yet Another Compiler-Compiler by Stephen C. Johnson ¤ÏÉ®¼Ô¤¬
2396 YACC ¤Ë¤Ä¤¤¤Æ°úÍѤ·¤¿¥É¥¥å¥á¥ó¥È¤Î°ì¤Ä¤Ç¤¹¡£
2398 url="http://www.cs.utexas.edu/users/novak/yaccpaper.htm">¤ÇÆþ¼ê¤Ç¤¤Þ
2399 ¤¹¡£¥¹¥¿¥¤¥ë¤Ë¤Ä¤¤¤Æ¤Î¥Ò¥ó¥È¤¬ºÜ¤Ã¤Æ¤¤¤Þ¤¹¡£
2403 Acknowledgements & Thanks
2410 <item>Pete Jinks <pjj@cs.man.ac.uk>
2411 <item>Chris Lattner <sabre@nondot.org>
2412 <item>John W. Millaway <johnmillaway@yahoo.com>
2413 <item>Martin Neitzel <neitzel@gaertner.de>
2414 <item>Esmond Pitt <esmond.pitt@bigpond.com>
2415 <item>Eric S. Raymond
2416 <item>Bob Schmertz <schmertz@wam.umd.edu>
2417 <item>Adam Sulmicki <adam@cfar.umd.edu>
2418 <item>Markus Triska <triska@gmx.at>
2419 <item>Erik Verbruggen <erik@road-warrior.cs.kun.nl>
2420 <item>Gary V. Vaughan <gary@gnu.org> (read his awesome <url NAME="Autobook"
2421 URL="http://sources.redhat.com/autobook">)
2422 <item><url NAME="Ivo van der Wijk" url="http://vanderwijk.info"> (
2423 <url NAME="Amaze Internet" url="http://www.amaze.nl">)
2426 ÌõÃí: ËÝÌõ¤Ë¤¢¤¿¤Ã¤Æ¤Ï¡¢»³²¼µÁÇ·¤µ¤ó¡¢¾®ÎÓ²íŵ¤µ¤ó¤Ëͱפʥ³¥á¥ó¥È¤ò¤¤
2427 ¤¿¤À¤¤Þ¤·¤¿¡£¤¢¤ê¤¬¤È¤¦¤´¤¶¤¤¤Þ¤·¤¿¡£