{{Short description|Macros whose expansion is guaranteed not to cause the capture of identifiers}}
{{Technical|date=November 2016}}
In computer science, '''hygienic macros''' are macros whose expansion is guaranteed not to cause the accidental capture of identifiers. They are a feature of programming languages such as Scheme,<ref name="r5rs" /> Dylan,<ref name="dylan">{{Citation |last1=Feinberg |first1=N. |last2=Keene |first2=S. E. |last3=Matthews |first3=R. O. |last4=Withington |first4=P. T. |title=Dylan programming: an object-oriented and dynamic language|publisher=Addison Wesley Longman Publishing Co., Inc. |year=1997}}</ref> Rust, Nim, and Julia.

The general problem of accidental capture was well known in the Lisp community before the introduction of hygienic macros. Macro writers would use language features that would generate unique identifiers (e.g., gensym) or use obfuscated identifiers to avoid the problem. Hygienic macros are a programmatic solution to the capture problem that is integrated into the macro expander.

The term "hygiene" was coined in Kohlbecker et al.'s 1986 paper that introduced hygienic macro expansion, inspired by terminology used in mathematics.<ref name="hygiene">{{cite conference |last1=Kohlbecker |first1=E. |last2=Friedman |first2=D. P. |last3=Felleisen |first3=M. |last4=Duba |first4=B. |year=1986 |title=Hygienic Macro Expansion |book-title=ACM conference on LISP and functional programming |url=http://www.cs.indiana.edu/pub/techreports/TR194.pdf}}</ref>

==The hygiene problem==
=== Variable shadowing ===
In programming languages that have non-hygienic macro systems, it is possible for existing variable bindings to be hidden from a macro by variable bindings that are created during its expansion. In C, this problem can be illustrated by the following fragment:

<syntaxhighlight lang="c">
#define INCI(i) { int a=0; ++i; }
int main(void)
{
    int a = 4, b = 8;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}
</syntaxhighlight>

Running the above through the C preprocessor produces:

<syntaxhighlight lang="c">
int main(void)
{
    int a = 4, b = 8;
    { int a = 0; ++a; };
    { int a = 0; ++b; };
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}
</syntaxhighlight>

The variable <code>a</code> declared in the top scope is shadowed by the <code>a</code> variable in the macro, which introduces a new scope. As a result, <code>a</code> is never altered by the execution of the program, as the output of the compiled program shows:

a is now 4, b is now 9
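The same kind of capture is easier to observe when the macro actually uses its temporary. The following sketch is illustrative only: the <code>SWAP</code> macro and the two helper functions are hypothetical names that do not appear in the example above. The temporary is named <code>tmp</code>, so a caller's variable that is also called <code>tmp</code> is hidden inside the expansion.

```c
/* Hypothetical unhygienic swap. Its temporary is named tmp, so an argument
   that is also named tmp resolves to the macro's local variable instead of
   the caller's variable. */
#define SWAP(x, y) { int tmp = 0; tmp = (x); (x) = (y); (y) = tmp; }

/* No name clash: the two values are exchanged as intended. */
void swap_without_clash(int out[2])
{
    int a = 1, b = 2;
    SWAP(a, b);
    out[0] = a;
    out[1] = b;
}

/* Name clash: every tmp inside the expansion refers to the macro's own
   temporary, so the caller's tmp is never modified. */
void swap_with_clash(int out[2])
{
    int tmp = 1, other = 2;
    SWAP(tmp, other);
    out[0] = tmp;
    out[1] = other;
}
```

Under these assumptions, <code>swap_without_clash</code> stores 2 and 1, whereas <code>swap_with_clash</code> stores the unchanged values 1 and 2.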

=== Standard library function redefinition ===
The hygiene problem can extend beyond variable bindings. Consider this Common Lisp macro:

<syntaxhighlight lang="lisp">
(defmacro my-unless (condition &body body)
  `(if (not ,condition)
       (progn
         ,@body)))
</syntaxhighlight>

While there are no references to variables in this macro, it assumes the symbols "if", "not", and "progn" are all bound to their usual definitions in the standard library. Suppose, however, that the macro is used in the following code:

<syntaxhighlight lang="lisp">
(flet ((not (x) x))
  (my-unless t
    (format t "This should not be printed!")))
</syntaxhighlight>

The definition of "not" has been locally altered, so the code produced by expanding <code>my-unless</code> calls the local function rather than the standard one, and its behavior changes.

Note, however, that Common Lisp forbids this behavior, as per [https://www.lispworks.com/documentation/lw70/CLHS/Body/11_abab.htm 11.1.2.1.2 Constraints on the COMMON-LISP Package for Conforming Programs]. It is nevertheless possible to completely redefine functions. Some implementations of Common Lisp provide [https://www.sbcl.org/manual/#Package-Locks Package Locks] to prevent the user from changing definitions in packages by mistake.

=== Program-defined function redefinition ===
Of course, the problem can occur for program-defined functions in a similar way:

<syntaxhighlight lang="lisp">
(defun user-defined-operator (cond)
  (not cond))

(defmacro my-unless (condition &body body)
  `(if (user-defined-operator ,condition)
       (progn
         ,@body)))

; ... later ...

(flet ((user-defined-operator (x) x))
  (my-unless t
    (format t "This should not be printed!")))
</syntaxhighlight>

The use site redefines <code>user-defined-operator</code> and hence changes the behavior of the macro.
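A rough C analogue of this situation can be sketched as follows. It is illustrative only: the names <code>MY_UNLESS</code>, <code>is_false</code>, <code>identity</code>, <code>count_normal</code> and <code>count_shadowed</code> are hypothetical, and a local function pointer stands in for <code>flet</code>, which C does not have.

```c
/* Program-defined helper that the macro is written against. */
static int is_false(int c) { return !c; }

/* Stand-in for the locally redefined operator. */
static int identity(int c) { return c; }

/* Unhygienic: the name is_false is looked up where the macro is used,
   not where it is defined. */
#define MY_UNLESS(cond, stmt) do { if (is_false(cond)) { stmt; } } while (0)

/* is_false refers to the file-scope function, so the body is skipped. */
int count_normal(void)
{
    int executed = 0;
    MY_UNLESS(1, executed++);
    return executed;
}

/* A local binding named is_false shadows the helper, so the expansion
   calls identity and the body runs. */
int count_shadowed(void)
{
    int (*is_false)(int) = identity;
    int executed = 0;
    MY_UNLESS(1, executed++);
    return executed;
}
```

With these definitions, <code>count_normal</code> returns 0 and <code>count_shadowed</code> returns 1, even though both functions contain the same macro call.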

==Strategies used in languages that lack hygienic macros==
In languages with conventional macros, the hygiene problem can be addressed in several alternative ways.

=== Obfuscation ===
The simplest solution, if temporary storage is needed during macro expansion, is to use unusual variable names in the macro in the hope that the same names will never be used by the rest of the program.

<syntaxhighlight lang="c">
#define INCI(i) { int INCIa = 0; ++i; }
int main(void)
{
    int a = 4, b = 8;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}
</syntaxhighlight>

Until a variable named <code>INCIa</code> is created, this solution produces the correct output:

a is now 5, b is now 9

The problem is solved for the current program, but this solution is not robust. The variables used inside the macro and those in the rest of the program have to be kept in sync by the programmer. Specifically, using the macro <code>INCI</code> on a variable <code>INCIa</code> is going to fail in the same way that the original macro failed on a variable <code>a</code>.
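The failure case described above can be sketched as follows. The two wrapper functions are hypothetical names introduced only for this illustration; the macro is the obfuscated <code>INCI</code> from the example.

```c
/* The obfuscated macro from the example above. */
#define INCI(i) { int INCIa = 0; ++i; }

/* An ordinary variable name: the caller's variable is incremented. */
int inci_on_ordinary_name(void)
{
    int a = 4;
    INCI(a);
    return a;
}

/* A variable that happens to share the obfuscated name: the expansion
   increments the macro's own INCIa, so the caller's variable keeps its value. */
int inci_on_clashing_name(void)
{
    int INCIa = 4;
    INCI(INCIa);
    return INCIa;
}
```

Here <code>inci_on_ordinary_name</code> returns 5, while <code>inci_on_clashing_name</code> returns 4.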

=== <span class="anchor" id="Gensym"></span> Temporary symbol creation ===
In some programming languages, it is possible for a new variable name, or symbol, to be generated and bound to a temporary location. The language processing system ensures that this never clashes with another name or location in the execution environment. The responsibility for choosing to use this feature within the body of a macro definition is left to the programmer. This method was used in MacLisp, where a function named <code>gensym</code> could be used to generate a new symbol name. Similar functions (usually named <code>gensym</code> as well) exist in many Lisp-like languages, including the widely implemented Common Lisp standard<ref>{{Cite web |url=http://www.lispworks.com/documentation/HyperSpec/Body/f_gensym.htm#gensym |title=CLHS: Function GENSYM}}</ref> and Elisp.

Although symbol creation solves the variable shadowing issue, it does not directly solve the issue of function redefinition.<ref>{{cite web |title=hygiene-versus-gensym |url=http://community.schemewiki.org/?hygiene-versus-gensym |website=community.schemewiki.org |access-date=11 June 2022}}</ref> However, <code>gensym</code>, macro facilities, and standard library functions are sufficient to embed hygienic macros in an unhygienic language.<ref>{{cite journal |last1=Costanza |first1=Pascal |last2=D'Hondt |first2=Theo |title=Embedding Hygiene-Compatible Macros in an Unhygienic Macro System |journal=Journal of Universal Computer Science |date=2010 |volume=16 |issue=2 |pages=271–295 |doi=10.3217/jucs-016-02-0271 |doi-access=free|citeseerx=10.1.1.424.5218}}</ref>
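C has no <code>gensym</code>, but the idea of generating a temporary name per expansion can be loosely approximated with the preprocessor. The following is only a sketch under stated assumptions: the macro names <code>PASTE_</code>, <code>PASTE</code>, <code>FRESH</code>, <code>SWAP_WITH</code> and <code>SWAP</code>, the <code>swap_tmp_</code> prefix and the helper function are all hypothetical. Unlike a real <code>gensym</code>, the generated name is unique only per source line and can still clash with a user variable of the same spelling.

```c
/* Two-level pasting so that __LINE__ is replaced by its number before
   the ## operator joins the tokens. */
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)

/* Builds a name such as swap_tmp_17 from the line of the macro call. */
#define FRESH(prefix) PASTE(prefix, __LINE__)

/* The temporary's name t is supplied by SWAP, so all three uses agree. */
#define SWAP_WITH(x, y, t) { int t = (x); (x) = (y); (y) = t; }
#define SWAP(x, y) SWAP_WITH(x, y, FRESH(swap_tmp_))

/* A caller variable named tmp no longer collides with the temporary. */
void swap_named_tmp(int out[2])
{
    int tmp = 1, other = 2;
    SWAP(tmp, other);
    out[0] = tmp;
    out[1] = other;
}
```

Under these assumptions, <code>swap_named_tmp</code> stores 2 and 1. As with <code>gensym</code>, this addresses only variable shadowing and does nothing about redefined functions.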

=== Read-time uninterned symbol ===
This is similar to obfuscation in that a single name is shared by multiple expansions of the same macro. Unlike an unusual name, however, a read-time uninterned symbol is used (denoted by the <code>#:</code> notation), which, like a symbol produced by <code>gensym</code>, cannot occur outside of the macro.

=== Packages ===
Using packages, as in Common Lisp, the macro simply uses a private symbol from the package in which the macro is defined. The symbol will not accidentally occur in user code. User code would have to reach inside the package using the double colon (<code>::</code>) notation to give itself permission to use the private symbol, for instance <code>cool-macros::secret-sym</code>. At that point, the issue of accidental lack of hygiene is moot. Furthermore, the ANSI Common Lisp standard categorizes redefining standard functions and operators, globally or locally, as invoking undefined behavior. Such usage can thus be diagnosed by the implementation as erroneous. The Lisp package system therefore provides a viable, complete solution to the macro hygiene problem, which can be regarded as an instance of name clashing.

For example, in the program-defined function redefinition example, the <code>my-unless</code> macro can reside in its own package, where <code>user-defined-operator</code> is a private symbol in that package. The symbol <code>user-defined-operator</code> occurring in the user code will then be a different symbol, unrelated to the one used in the definition of the <code>my-unless</code> macro.

=== Literal objects ===
In some languages, the expansion of a macro does not need to correspond to textual code; rather than expanding to an expression containing the symbol <code>f</code>, a macro may produce an expansion containing the actual object referred to by <code>f</code>. Similarly, if the macro needs to use local variables or objects defined in the macro's package, it can expand to an invocation of a closure object whose enclosing lexical environment is that of the macro definition.

== Hygienic transformation ==
Hygienic macro systems in languages such as Scheme use a macro expansion process that preserves the lexical scoping of all identifiers and prevents accidental capture. This property is called referential transparency. In cases where capture is desired, some systems allow the programmer to explicitly violate the hygiene mechanisms of the macro system.

For example, Scheme's <code>let-syntax</code> and <code>define-syntax</code> macro creation systems are hygienic, so the following Scheme implementation of <code>my-unless</code> will have the desired behavior:

<syntaxhighlight lang="scheme">
(define-syntax my-unless
  (syntax-rules ()
    ((_ condition body ...)
     (if (not condition)
         (begin body ...)))))

(let ((not (lambda (x) x)))
  (my-unless #t
    (display "This should not be printed!")
    (newline)))
</syntaxhighlight>

The hygienic macro processor responsible for transforming the patterns of the input form into an output form detects symbol clashes and resolves them by temporarily changing the names of symbols. The basic strategy is to identify ''bindings'' in the macro definition and replace those names with gensyms, and to identify ''free variables'' in the macro definition and make sure those names are looked up in the scope of the macro definition instead of the scope where the macro was used.

== Implementations ==
Macro systems that automatically enforce hygiene originated with Scheme. The original KFFD algorithm for a hygienic macro system was presented by Kohlbecker et al. in 1986.<ref name="hygiene" /> At the time, no standard macro system was adopted by Scheme implementations. Shortly thereafter in 1987, Kohlbecker and Wand proposed a declarative pattern-based language for writing macros, which was the predecessor to the <code>syntax-rules</code> macro facility adopted by the R5RS standard.<ref name="r5rs">{{cite journal |last1=Kelsey |first1=Richard |last2=Clinger |first2=William |last3=Rees |first3=Jonathan |last4=Rozas |first4=G.J. |last5=Adams Iv |first5=N.I. |last6=Friedman |first6=D.P. |last7=Kohlbecker |first7=E. |last8=Steele Jr. |first8=G.L. |last9=Bartley |first9=D.H. |display-authors=3 |date=August 1998 |title=Revised<sup>5</sup> Report on the Algorithmic Language Scheme |url=http://www.schemers.org/Documents/Standards/R5RS/ |journal=Higher-Order and Symbolic Computation |volume=11 |issue=1 |pages=7–105 |doi=10.1023/A:1010051815785 |url-access=subscription }}</ref><ref>{{cite conference |last1=Kohlbecker |first1=E. |last2=Wand |first2=M. |year=1987 |title=Macro-by-example: Deriving syntactic transformations from their specifications |book-title=Symposium on Principles of Programming Languages |url=http://jcmc.indiana.edu/pub/techreports/TR206.pdf}}</ref> Syntactic closures were proposed by Bawden and Rees in 1988 as an alternative hygiene mechanism to Kohlbecker et al.'s system.<ref name="syntactic-closures">{{cite conference |last1=Bawden |first1=A. |last2=Rees |first2=J. |year=1988 |title=Syntactic closures |book-title=Lisp and Functional Programming |url=https://apps.dtic.mil/dtic/tr/fulltext/u2/a195921.pdf |archive-url=https://web.archive.org/web/20190903013739/https://apps.dtic.mil/dtic/tr/fulltext/u2/a195921.pdf |url-status=live |archive-date=September 3, 2019}}</ref> Unlike the KFFD algorithm, syntactic closures require the programmer to explicitly specify the resolution of the scope of an identifier. In 1993, Dybvig et al. introduced the <code>syntax-case</code> macro system, which uses an alternative representation of syntax and maintains hygiene automatically.<ref name="syntax-case">{{cite journal |last1=Dybvig |first1=K |last2=Hieb |first2=R |last3=Bruggeman |first3=C |year=1993 |title=Syntactic abstraction in Scheme |journal=LISP and Symbolic Computation |volume=5 |issue=4 |pages=295–326 |url=http://www.cs.indiana.edu/~dyb/pubs/LaSC-5-4-pp295-326.pdf |doi=10.1007/BF01806308 |s2cid=15737919}}</ref> The <code>syntax-case</code> system can express the <code>syntax-rules</code> pattern language as a derived macro. The term ''macro system'' can be ambiguous because, in the context of Scheme, it can refer to both a pattern-matching construct (e.g., syntax-rules) and a framework for representing and manipulating syntax (e.g., syntax-case, syntactic closures).

=== Syntax-rules ===
Syntax-rules is a high-level pattern matching facility that attempts to make macros easier to write. However, <code>syntax-rules</code> is not able to succinctly describe certain classes of macros and is insufficient to express other macro systems. Syntax-rules was described in the R4RS document in an appendix but not mandated. Later, R5RS adopted it as a standard macro facility. Here is an example <code>syntax-rules</code> macro that swaps the values of two variables:

<syntaxhighlight lang="Scheme">
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((temp a))
       (set! a b)
       (set! b temp)))))
</syntaxhighlight>

=== Syntax-case ===
Due to the deficiencies of a purely <code>syntax-rules</code> based macro system, the R6RS Scheme standard adopted the syntax-case macro system.<ref name="r6rs">{{cite web |url=http://www.r6rs.org |title=Revised<sup>6</sup> Report on the Algorithmic Language Scheme (R6RS) |last1=Sperber |first1=Michael |last2=Dybvig |first2=R. Kent |last3=Flatt |first3=Matthew |last4=Van Straaten |first4=Anton |display-authors=etal |date=August 2007 |publisher=Scheme Steering Committee |access-date=2011-09-13}}</ref> Unlike <code>syntax-rules</code>, <code>syntax-case</code> contains both a pattern matching language and a low-level facility for writing macros. The former allows macros to be written declaratively, while the latter allows the implementation of alternative frontends for writing macros. The swap example from before is nearly identical in <code>syntax-case</code> because the pattern matching language is similar:

<syntaxhighlight lang="Scheme">
(define-syntax swap!
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (syntax (let ((temp a))
                 (set! a b)
                 (set! b temp)))))))
</syntaxhighlight>

However, <code>syntax-case</code> is more powerful than <code>syntax-rules</code>. For example, <code>syntax-case</code> macros can specify side-conditions on their pattern matching rules via arbitrary Scheme functions. Alternatively, a macro writer can choose not to use the pattern matching frontend and manipulate the syntax directly. Using the <code>datum->syntax</code> function, <code>syntax-case</code> macros can also intentionally capture identifiers, thus breaking hygiene.

=== Other systems ===
Other macro systems have also been proposed and implemented for Scheme. Syntactic closures and explicit renaming<ref>{{cite journal |last=Clinger |first=Will |journal=ACM SIGPLAN Lisp Pointers |volume=4 |issue=4 |pages=25–28 |year=1991 |title=Hygienic macros through explicit renaming |doi=10.1145/1317265.1317269 |s2cid=14628409}}</ref> are two alternative macro systems. Both systems are lower-level than syntax-rules and leave the enforcement of hygiene to the macro writer. This differs from both syntax-rules and syntax-case, which automatically enforce hygiene by default. The swap examples from above are shown here using a syntactic closure and explicit renaming implementation respectively:

<syntaxhighlight lang="Scheme">
;; syntactic closures
(define-syntax swap!
  (sc-macro-transformer
   (lambda (form environment)
     (let ((a (close-syntax (cadr form) environment))
           (b (close-syntax (caddr form) environment)))
       `(let ((temp ,a))
          (set! ,a ,b)
          (set! ,b temp))))))

;; explicit renaming
(define-syntax swap!
  (er-macro-transformer
   (lambda (form rename compare)
     (let ((a (cadr form))
           (b (caddr form))
           (temp (rename 'temp)))
       `(,(rename 'let) ((,temp ,a))
          (,(rename 'set!) ,a ,b)
          (,(rename 'set!) ,b ,temp))))))
</syntaxhighlight>

===Languages with hygienic macro systems===
* Scheme – syntax-rules, syntax-case, syntactic closures, and others.
* Racket – a Scheme variant, its macro system was originally based on syntax-case, but now has more features.
* Nemerle<ref name="nemerle">{{Citation |last1=Skalski |first1=K. |last2=Moskal |first2=M |last3=Olszta |first3=P |title=Metaprogramming in Nemerle |url=http://nemerle.org/metaprogramming.pdf |url-status=dead |archive-url=https://web.archive.org/web/20121113081854/http://nemerle.org/metaprogramming.pdf |archive-date=2012-11-13}}</ref>
* Dylan
* Elixir<ref>{{Cite web |url=http://elixir-lang.org/getting-started/meta/macros.html#macros-hygiene |title=Macros}}</ref>
* Nim
* Rust
* Haxe
* Mary2 – scoped macro bodies in an ALGOL 68-derivative language circa 1978
* Julia<ref>{{Cite web |url=http://docs.julialang.org/en/latest/manual/metaprogramming/#hygiene |title=Metaprogramming: the Julia Language |access-date=2014-03-03 |archive-date=2013-05-04 |archive-url=https://web.archive.org/web/20130504074021/http://docs.julialang.org/en/latest/manual/metaprogramming/#hygiene |url-status=dead}}</ref>
* Raku – supports both hygienic and unhygienic macros<ref>{{Cite web |url=http://perlcabal.org/syn/S06.html#Macros |title=Synopsis 6: Subroutines |access-date=2014-06-03 |archive-url=https://web.archive.org/web/20140106032143/http://perlcabal.org/syn/S06.html#Macros |archive-date=2014-01-06 |url-status=dead}}</ref>
* Lean<ref>{{cite arXiv |last1=Ullrich |first1=Sebastian |last2=de Moura |first2=Leonardo |date=2020-01-28 |title=Beyond Notations: Hygienic Macro Expansion for Theorem Proving Languages |arxiv=2001.10490 |class=cs}}</ref>

==Criticism==
Hygienic macros offer safety and referential transparency at the expense of making intentional variable capture less straightforward. Doug Hoyte, author of ''Let Over Lambda'', writes:<ref>[https://letoverlambda.com/index.cl/guest/chap3.html#sec_4], Let Over Lambda—50 Years of Lisp by Doug Hoyte</ref>

{{cquote|Almost all approaches taken to reducing the impact of variable capture serve only to reduce what you can do with defmacro. Hygienic macros are, in the best of situations, a beginner's safety guard-rail; in the worst of situations they form an electric fence, trapping their victims in a sanitised, capture-safe prison.|author=Doug Hoyte}}

Many hygienic macro systems do offer escape hatches without compromising the guarantees that hygiene provides; for instance, Racket allows the programmer to define [https://docs.racket-lang.org/reference/stxparam.html#%28tech._syntax._parameter%29 syntax parameters], which selectively introduce bound variables. Greg Hendershott gives an example at Fear of Macros<ref>[https://www.greghendershott.com/fear-of-macros/Syntax_parameters.html], Fear of Macros</ref> of implementing an anaphoric if operator in this way.

==See also==
* Anaphoric macro
* Partial evaluation
* Preprocessor
* Syntactic closure

==Notes==
{{Reflist|30em}}

==References==
{{More footnotes needed|date=April 2012}}
*''On Lisp'', Paul Graham
*[http://community.schemewiki.org/?syntax-rules syntax-rules on schemewiki]
*[http://community.schemewiki.org/?syntax-case syntax-case on schemewiki]
*[http://community.schemewiki.org/?syntax-case-examples examples of syntax-case on schemewiki]
*[http://community.schemewiki.org/?syntactic-closures syntactic closures on schemewiki]
*[http://community.schemewiki.org/?simpler-macros simpler-macros on schemewiki]
*[http://community.schemewiki.org/?simpler-macros-examples examples of simpler-macros on schemewiki]
*[http://www.cs.indiana.edu/~dyb/pubs/tr356.pdf Writing Hygienic Macros in Scheme with Syntax-Case]

{{Programming paradigms navbox}}

{{DEFAULTSORT:Macros, Hygienic}}
Category:Transformation languages
Category:Scheme (programming language)
Category:Dylan (programming language)
Category:Metaprogramming