How to Remove Whitespace from a String in Ruby

Often, you want to sanitize a string by removing all leading and trailing whitespace. This is usually needed when the string comes from a user or an external source. This post shows you how.

The simplest way to remove leading and trailing whitespace from a string is to use the String#strip method. It returns a copy of the string (without modifying the original) with whitespace removed from both ends.

puts "  hello ".strip

# "hello"

If you want to modify the original string in place, use the bang (!) version of the method, strip!, as follows:

name = " Ruby "
puts name  # " Ruby "

name.strip!  # modifies name in place
puts name  # "Ruby"
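One detail worth knowing about the bang version: strip! returns nil when the string had no whitespace to remove, so its return value is not safe to chain. Here is a minimal sketch; the variable names are only for illustration, and the .dup calls are there so the example also runs where string literals are frozen.

```ruby
padded = " Ruby ".dup
clean  = "Ruby".dup

# strip! mutates the receiver and returns it when something was removed.
padded_result = padded.strip!

# strip! returns nil when there was nothing to strip.
clean_result = clean.strip!

puts padded.inspect         # "Ruby"
puts padded_result.inspect  # "Ruby"
puts clean_result.inspect   # nil
```

If you need the cleaned value as the result of an expression, the non-bang strip is the safer choice.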

Ruby also provides the lstrip, lstrip!, rstrip, and rstrip! methods to remove whitespace from only one end of the string: lstrip handles the left (leading) side and rstrip the right (trailing) side.

input = "  Ruby on Rails     "

puts input.strip    # "Ruby on Rails"
puts input.lstrip   # "Ruby on Rails     "
puts input.rstrip   # "  Ruby on Rails"

Remove All Whitespace in a String

If the string contains extra whitespace inside, you can use the String#gsub method to replace it using a regex.

input = "  Ruby   on   Rails   "

puts input.gsub(/\s+/, "")    # "RubyonRails"

In the above code, the /\s+/ regular expression matches one or more consecutive whitespace characters (such as spaces, tabs, or newlines).

Note how it removed all whitespace both around and inside the string, resulting in RubyonRails, which you may or may not want.
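To see that /\s+/ is not limited to plain spaces, here is a small sketch with tabs and line breaks mixed in. The messy variable is just an illustrative input.

```ruby
# A string separated by a tab and a newline, ending in a space and "\r\n".
messy = "Ruby\ton\nRails \r\n"

# Every run of whitespace, whatever its mix of characters, is removed.
no_whitespace = messy.gsub(/\s+/, "")

puts no_whitespace.inspect   # "RubyonRails"
```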

The String#squish method, which Rails adds through Active Support, provides a nice alternative by removing only the extra whitespace inside the string. That is, it removes all whitespace on both ends of the string and collapses each remaining group of consecutive whitespace into a single space.

input = "  Ruby   on   Rails   "

input.squish    # "Ruby on Rails"

Pretty handy.
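If you are not using Rails, the same idea can be approximated in plain Ruby by combining strip and gsub. This is only a sketch: squish_like is a hypothetical helper name, not part of Ruby or Rails, and it only handles the characters matched by \s, so Rails' own squish may treat other Unicode whitespace differently.

```ruby
# Hypothetical helper, not part of Ruby or Rails.
# Trims both ends, then collapses each inner run of whitespace to one space.
def squish_like(text)
  text.strip.gsub(/\s+/, " ")
end

puts squish_like("  Ruby   on   Rails   ").inspect   # "Ruby on Rails"
puts squish_like("Ruby\n\ton  Rails").inspect         # "Ruby on Rails"
```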

Bonus: If you want to remove just the last character of a string, whether or not it is whitespace, use the String#chop method.

puts "whoops".chop

# "whoop"
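A couple of edge cases are worth a quick look. In the sketch below, chop treats a trailing "\r\n" as a single line ending and removes both characters, and calling it on an empty string simply returns an empty string. The variable names are only illustrative.

```ruby
# A Windows-style line ending counts as one unit for chop.
windows_line = "hello\r\n".chop

# A lone newline is removed like any other last character.
unix_line = "hello\n".chop

# Chopping an empty string is safe and returns an empty string.
empty = "".chop

puts windows_line.inspect   # "hello"
puts unix_line.inspect      # "hello"
puts empty.inspect          # ""
```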

What Is Considered Whitespace in Strings?

In Ruby's String class, whitespace is defined as a contiguous sequence of characters consisting of any mixture of the following:

  • NUL (null): "\x00", "\u0000".
  • HT (horizontal tab): "\x09", "\t".
  • LF (line feed): "\x0a", "\n".
  • VT (vertical tab): "\x0b", "\v".
  • FF (form feed): "\x0c", "\f".
  • CR (carriage return): "\x0d", "\r".
  • SP (space): "\x20", " ".
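As a quick check of the list above, the sketch below pads a string with each of these characters and strips it. The null character is placed only at the very end here, which is a deliberate choice for this example, and the variable names are illustrative.

```ruby
# Tab, line feed, vertical tab, form feed, carriage return, and space on
# both sides, with a trailing null character at the very end.
padded = "\t\n\v\f\r hello \t\n\v\f\r\x00"

cleaned = padded.strip

puts cleaned.inspect   # "hello"
```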

That's a wrap. I hope you found this article helpful and you learned something new.
