Stata | FAQ: Removing leading or trailing zeros from string variables

Home / Resources & support / FAQs / Removing leading or trailing zeros from string variables

How do I remove leading or trailing zeros from string variables?

Title		Removing leading or trailing zeros from string variables
Authors		Nicholas J. Cox, Durham University, UK Jamie Griffin, Imperial College London, UK

Suppose you wish to remove leading or trailing zeros from a string variable (or from a global or local macro). To be clear on terminology here, a string may contain zeros in leading positions, such as "0string"; in trailing positions, such as "string00"; in both; or in some intermediate position, such as "string000string". Here string stands for any string containing characters other than zero "0".

This problem is interesting because it provides an opportunity to consider the operation of various string functions. When we say zeros, in principle we could say any other character, except leading and trailing blanks should be removed using ustustrrtrim(), ustrltrim(), or ustrrtrim().

To make the problem as general and as challenging as possible, assume that we do not know in advance how many leading or trailing zeros there are and zeros may also occur in intermediate positions.

However, let us recall what to do if we do know how many leading or trailing zeros occur and zeros do not occur in intermediate positions. As we said, you can use these solutions with some modification to remove other characters. If we knew there was just one leading zero, and no other zero, we could remove it by changing it to an empty string using the usubinstr() function by typing either

        . replace myvar = usubinstr(myvar, "0", "", 1)

        . replace myvar = usubinstr(myvar, "0", "", .)

The last argument of usubinstr() is just the number of instances to be replaced, either the first or all of them, and has the same effect if at most one zero is present. This solution also applies to at most one trailing zero and to no others. However, usubinstr() can remove characters we want to keep.

Another line of attack is to use the usubstr() function. This works best if we know precisely how many characters to remove. Thus typing

        . replace myvar = usubstr(myvar, 2, .)

and

        . replace myvar = usubstr(myvar, 1, ustrlen(myvar) − 1)

respectively would remove the first and last character only. We could make that conditional on the character in question being "0", as in

        . replace myvar = usubstr(myvar, 2, .) if usubstr(myvar,1,1) == "0"

and

        . replace myvar = usubstr(myvar, 1, ustrlen(myvar) − 1) if
        > usubstr(myvar,−1,1) == "0"

which for our problem would do no harm and possibly some good.

However, we have yet to solve the general problem posed above. One idea is to keep removing zeros, so long as we find them, in a loop, as in

        . quietly count if usubstr(myvar, 1, 1) == "0" 
        . while r(N) { 
        .	replace myvar = usubstr(myvar, 2, .) if usubstr(myvar, 1, 1) == "0" 	
        .	count if usubstr(myvar, 1, 1) == "0" 
        . }

Here the result of count is left behind in r(N). If that is not zero, i.e., if r(N) is one or more, Stata enters the while loop. We remove the first zero whenever there is one and count again. Stata will exit the loop as soon as r(N) becomes zero. A similar technique could be used to remove an arbitrary number of trailing zeros, testing with usubstr(myvar, −1, 1) == "0" and removing with usubstr(myvar, 1, ustrlen(myvar) − 1).

This flexes a few small programming muscles but otherwise is less attractive than a better way that makes use of another function. indexnot() finds the position of the first character other than the one specified.

        . replace myvar = usubstr(myvar, indexnot(myvar, "0"), .)

Note that if myvar did not contain leading zeros, indexnot() would return 1 and myvar would just be replaced by itself. One way to find trailing zeros with this approach is to ustrreverse() the string before and after applying the same logic:

        . replace myvar = ustrreverse(usubstr(ustrreverse(myvar), indexnot(ustrreverse(myvar), "0"), .))

Cautious souls will want to see this done step by step:

        . generate rev = ustrreverse(myvar)
        . replace rev = usubstr(rev, indexnot(rev, "0"), .) 
        . replace myvar = ustrreverse(rev) 
        . drop rev

We use cookies

We use cookies to ensure that we give you the best experience on our website—to enhance site navigation, to analyze usage, and to assist in our marketing efforts. By continuing to use our site, you consent to the storing of cookies on your device and agree to delivery of content, including web fonts and JavaScript, from third party web services.

Cookie Settings

Last updated: 16 November 2022

StataCorp LLC (StataCorp) strives to provide our users with exceptional products and services. To do so, we must collect personal information from you. This information is necessary to conduct business with our existing and potential customers. We collect and use this information only where we may legally do so. This policy explains what personal information we collect, how we use it, and what rights you have to that information.

Advertising and performance cookies

This website uses cookies to provide you with a better user experience. A cookie is a small piece of data our website stores on a site visitor's hard drive and accesses each time you visit so we can improve your access to our site, better understand how you use our site, and serve you content that may be of interest to you. For instance, we store a cookie when you log in to our shopping cart so that we can maintain your shopping cart should you not complete checkout. These cookies do not directly store your personal information, but they do support the ability to uniquely identify your internet browser and device.

Please note: Clearing your browser cookies at any time will undo preferences saved here. The option selected here will apply only to the device you are currently using.

How do I remove leading or trailing zeros from string variables?

We use cookies

Privacy policy

Required cookies

Advertising and performance cookies

Stata/MP4 Annual License (download)

How do I remove leading or trailing zeros from string variables?

We use cookies

Privacy policy

Required cookies

Advertising and performance cookies