Module:String2/sandbox







From Wikipedia, the free encyclopedia
 


The module String2 contains a number of string manipulation functions that are much less commonly used than those in Module:String. Because Module:String is cascade-protected (some of its functions are used on the Main Page), it can be edited and maintained only by admins, not by template editors. Although string-handling functions rarely need maintenance, it is useful to let template editors do that work where possible, so this module is available to them for developing novel functionality.

The module contains three case-related calls that convert strings to first letter uppercase, sentence case or title case and two calls that are useful for working with substrings. There are other utility calls that strip leading zeros from padded numbers and transform text so that it is not interpreted as wikitext, and several other calls that solve specific problems for template developers such as finding the position of a piece of text on a given page.

The functions are designed with the possibility of working with text returned from Wikidata in mind. However, a call to Wikidata may return empty, so the functions should generally fail gracefully if supplied with a missing or blank input parameter, rather than throwing an error.

Functions

trim

The trim function simply trims whitespace characters from the start and end of the string.

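title

The title function capitalises the first letter of each word in the text, apart from a number of short words listed in The U.S. Government Printing Office Style Manual §3.49 "Center and side heads": a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor. The first word is always capitalised. Note that the word list in the module code below differs slightly from this one: it includes from and omits as.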

This is a very simplistic algorithm; see Template:Title case/doc for some of its limitations.
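The rule described above can be sketched in plain Lua. This is only an illustration, not the module's code: the name title_case is hypothetical, the word table is copied from the module source, and the sketch uses ASCII-only string functions and splits on any run of whitespace, whereas the module uses MediaWiki's language-aware helpers and splits on single spaces.

```lua
-- Words kept in lower case unless they start the text (copied from the module's table).
local always_lower = {
  a = true, an = true, the = true, ["and"] = true, but = true, ["or"] = true,
  ["for"] = true, nor = true, on = true, ["in"] = true, at = true, to = true,
  from = true, by = true, of = true, up = true,
}

-- Hypothetical helper: ASCII-only title casing.
local function title_case(text)
  local words = {}
  for word in text:gmatch("%S+") do
    word = word:lower()
    -- The first word is always capitalised; listed short words stay lower case.
    if #words == 0 or not always_lower[word] then
      word = word:sub(1, 1):upper() .. word:sub(2)
    end
    words[#words + 1] = word
  end
  return table.concat(words, " ")
end

print(title_case("the vitamins are in my fresh california raisins"))
```

Because every word is lower-cased first, acronyms and proper nouns written in capitals lose their capitalisation, which is one of the limitations of this approach.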

sentence

The sentence function finds the first letter and capitalises it, then renders the rest of the text in lower case. It works properly with text containing wiki markup. Compare {{#invoke:String2|sentence|[[action game]]}} → Action game with {{ucfirst:{{lc:[[action game]]}}}} → action game. Piped wiki-links and lists are handled as well.

ucfirst

The ucfirst function is similar to sentence; it renders the first alphabetical character in upper case, but leaves the capitalisation of the rest of the text unaltered. This is useful if the text contains proper nouns, but it will not regularise sentences that are ALLCAPS, for example. It also works with text containing piped wiki-links and with HTML lists.

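findlast

The findlast function returns the last item in a list. The first unnamed parameter is the list; the second, optional unnamed parameter is the list separator (default: a comma followed by a space). If the separator is not found, the whole list is returned.

One potential issue is that the separator is inserted into a Lua pattern without escaping, so using Lua special pattern characters (^$()%.[]*+-?) as the separator will probably cause problems.

Examples
Case: Wikitext → Output
Normal usage: {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291}} → 6,145,291
Space as separator: {{#invoke:String2 |findlast | 5 932 992,532 6,074,702 6,145,291 }} → 5 932 992,532 6,074,702 6,145,291
One item list: {{#invoke:String2 |findlast | 6,074,702 }} → 6,074,702
Separator not found: {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291 |;}} → 5, 932, 992,532, 6,074,702, 6,145,291
List missing: {{#invoke:String2 |findlast |}} → (blank)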
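The special-character caveat can be illustrated with a plain-Lua sketch. The names escape_pattern and find_last are hypothetical, and the escaping step is an assumption about how the problem could be avoided; the module itself does not escape the separator.

```lua
-- Hypothetical helper: prefix every Lua pattern magic character with '%'.
local function escape_pattern(s)
  return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Return the text after the last separator, or the whole list if none is found.
local function find_last(list, sep)
  if sep == nil or sep == "" then sep = ", " end
  -- The greedy ".*" consumes as much as possible, so the capture is the last item.
  local last = list:match(".*" .. escape_pattern(sep) .. "(.*)")
  return last or list
end

print(find_last("5, 932, 992,532, 6,074,702, 6,145,291"))  -- default separator ", "
print(find_last("a.b.c", "."))                             -- "." is matched literally
```

Without the escaping step, a separator of "." would act as the pattern wildcard and the capture would come back empty for "a.b.c".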

split

The split function splits text at boundaries specified by separator and returns the chunk for the index idx (starting at 1). It can use positional parameters or named parameters (but these should not be mixed):

Usage
{{#invoke:String2 |split |text |separator |index |true/false}}
{{#invoke:String2 |split |txt=text |sep=separator |idx=index |plain=true/false}}

Any double quotes (") in the separator parameter are stripped out, which allows spaces and wikitext like ["[ to be passed. Use {{!}} for the pipe character |.

If the optional plain parameter is set to false / no / 0 then separator is treated as a Lua pattern. The default is plain=true, i.e. normal text matching.

The index parameter is optional; it defaults to the first chunk of text. A negative index counts back from the end, so -1 selects the last chunk.

The {{string split}} template is a convenience wrapper for the split function.
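The behaviour described in this section (quote stripping, plain versus pattern separators, and negative indexes) can be sketched in plain Lua. The helper split_all is a simplified stand-in for mw.text.split written for this illustration, not MediaWiki's implementation, and both function names are assumptions.

```lua
-- Simplified stand-in for mw.text.split (assumption: byte-based, stops on empty matches).
local function split_all(text, sep, plain)
  local chunks, start = {}, 1
  while true do
    local i, j = text:find(sep, start, plain)
    if not i or j < i then break end          -- no match, or an empty match
    chunks[#chunks + 1] = text:sub(start, i - 1)
    start = j + 1
  end
  chunks[#chunks + 1] = text:sub(start)
  return chunks
end

-- Return chunk number idx (1-based; negative values count from the end).
local function split(text, sep, idx, plain)
  if plain == nil then plain = true end       -- plain text matching is the default
  sep = sep:gsub('"', '')                     -- double quotes in the separator are dropped
  idx = idx or 1
  local chunks = split_all(text, sep, plain)
  if idx < 0 then idx = #chunks + idx + 1 end
  return chunks[idx]
end

print(split("a, b, c", ", ", 2))
print(split("a1b22c", "%d+", -1, false))
```

An index beyond the number of chunks returns nil here, which corresponds to blank output in wikitext.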

stripZeros

The stripZeros function finds the first number in a string of text and strips leading zeros, but retains a zero which is followed by a decimal point. For example: "0940" → "940"; "Year: 0023" → "Year: 23"; "00.12" → "0.12".
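A minimal plain-Lua sketch of the same behaviour follows. The name strip_zeros is hypothetical, and the sketch removes the zeros with a string substitution rather than the number conversion used in the module source, so it is an alternative technique that should give the same results for the examples above.

```lua
-- Hypothetical helper: strip leading zeros from the first run of digits only.
local function strip_zeros(s)
  local digits = s:match("%d+")
  if not digits then return s end             -- no number: nothing to do
  local stripped = digits:gsub("^0+", "")
  if stripped == "" then stripped = "0" end   -- keep a single zero, e.g. before a decimal point
  return (s:gsub("%d+", stripped, 1))         -- replace the first run of digits only
end

print(strip_zeros("Year: 0023"))
```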

nowiki

The nowiki function ensures that a string of text is treated by the MediaWiki software as just a string, not code. It trims leading and trailing whitespace.

val2percent

The val2percent function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each number it finds into a percentage, then returns the resulting string.
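The scan-and-convert step can be sketched in plain Lua using the same number pattern as the module source. The function name and the use of string.format for whole numbers are choices made for this illustration.

```lua
-- Convert every number found in the text to a percentage.
local function val2percent(text)
  local function to_percent(num)
    local x = (tonumber(num) or 0) * 100      -- unparseable matches count as 0
    if x == math.floor(x) then
      return string.format("%d%%", x)         -- whole numbers print without a decimal part
    end
    return x .. "%"
  end
  -- A digit followed by any mix of digits and dots is treated as one number.
  return (text:gsub("%d[%d%.]*", to_percent))
end

print(val2percent("0.5 of users"))
```

One consequence of the pattern is that a number immediately followed by a full stop is captured together with the stop, fails to parse, and is rendered as 0%.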

one2a

The one2a function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each occurrence of 'one ' into either 'a ' or 'an ', then returns the resultant string.

Template:One2a is a convenience wrapper for the one2a function.
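The chain of substitutions can be sketched in plain Lua. This is an illustration modelled on the module source; the sketch matches 'one ' with its trailing space at the start of the text, as the description above states.

```lua
-- Replace 'one ' with 'a ', then turn 'a ' into 'an ' before a vowel.
local function one2a(text)
  text = text:gsub(" one ", " a ")
             :gsub("^one ", "a ")
             :gsub("One ", "A ")
             :gsub("a ([aeiou])", "an %1")
             :gsub("A ([aeiou])", "An %1")
  return text
end

print(one2a("One inch"))
print(one2a("about one mile"))
```

The vowel test is purely orthographic, so it cannot tell "an hour" from "a house" or "a unit" from "an uncle".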

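findpagetext

The findpagetext function returns the position of a piece of text in the wikitext source of a page. It takes up to four parameters: the search text (first positional parameter or |text=); |title=, the page title, which defaults to the current page; |plain=, either true for a plain search (default) or false for a Lua pattern search; and |nomatch=, the value returned when no match is found (default: nothing).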

Examples
{{#invoke:String2 |findpagetext |text=Youghiogheny}}
{{#invoke:String2 |findpagetext |text=Youghiogheny |nomatch=not found}} → not found
{{#invoke:String2 |findpagetext |text=Youghiogheny |title=Boston Bridge |nomatch=not found}} → 296
{{#invoke:String2 |findpagetext |text=river |title=Boston Bridge |nomatch=not found}} → not found
{{#invoke:String2 |findpagetext |text=[Rr]iver |title=Boston Bridge |plain=false |nomatch=not found}} → 309
{{#invoke:String2 |findpagetext |text=%[%[ |title=Boston Bridge |plain=f |nomatch=not found}} → 294
{{#invoke:String2 |findpagetext |text=%{%{[Cc]oord |title=Boston Bridge |plain=f |nomatch=not found}} → 2470

The search is case-sensitive, so Lua pattern matching is needed to find river or River. The last example finds {{coord and {{Coord. The penultimate example finds a wiki-link.

Template:Findpagetext is a convenience wrapper for this function.
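The difference between plain and pattern searching can be shown on an ordinary string in plain Lua. The sample text and the name find_text are made up for this sketch; the module searches the page's wikitext with mw.ustring.find, which counts characters, whereas string.find used here counts bytes.

```lua
-- Hypothetical stand-in for a page's wikitext.
local content = "The Boston Bridge crosses the Youghiogheny River."

-- Return the start position of text in content, or nomatch if it is absent.
local function find_text(content, text, plain, nomatch)
  if plain == nil then plain = true end       -- plain search is the default
  local pos = string.find(content, text, 1, plain)
  return pos or nomatch
end

print(find_text(content, "river", true, "not found"))      -- case-sensitive plain search
print(find_text(content, "[Rr]iver", false, "not found"))  -- pattern search matches either case
```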

strip

The strip function strips the first positional parameter of the characters or pattern supplied in the second positional parameter.

Usage
{{#invoke:String2|strip|source_string|characters_to_strip|plain_flag}}
{{#invoke:String2|strip|source=|chars=|plain=}}
Examples
{{#invoke:String2|strip|abc123def|123}} → abcdef
{{#invoke:String2|strip|abc123def|%d+|false}} → abcdef
{{#invoke:String2|strip|source=abc123def|chars=123}} → abcdef
{{#invoke:String2|strip|source=abc123def|chars=%d+|plain=false}} → abcdef
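The examples above can be reproduced with a plain-Lua sketch modelled on the module source. The function names are assumptions, and string.gsub is used in place of mw.ustring.gsub, so the sketch is byte-based.

```lua
-- Escape pattern magic characters so they are treated as plain text.
local function escape_pattern(s)
  return (s:gsub("[%(%)%.%%%+%-%*%?%[%^%$%]]", "%%%0"))
end

-- Remove every character of source that belongs to the set given in chars.
local function strip(source, chars, plain)
  if plain == nil then plain = true end
  if source == "" or chars == "" then return source end
  if plain then chars = escape_pattern(chars) end
  -- chars is wrapped in [...], so it acts as a character set, not a substring.
  return (source:gsub("[" .. chars .. "]", ""))
end

print(strip("abc123def", "123"))
print(strip("banana", "an"))
```

Because the characters form a set, stripping "an" from "banana" removes every a and every n rather than the substring "an".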

matchAny

The matchAny function returns the index of the first positional parameter to match the source parameter. If the plain parameter is set to false (default true) then the search strings are Lua patterns. This can usefully be put in a switch statement to pick a switch case based on which pattern a string matches. Returns the empty string if nothing matches, for use in {{#if}}.

{{#invoke:String2|matchAny|123|abc|source=abc 124}} returns 2.
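The lookup can be sketched in plain Lua with the search strings passed as a table. The name match_any and the table-based signature are assumptions for this illustration; the module reads the strings from consecutively numbered template parameters.

```lua
-- Return the index (as a string) of the first entry found in source, or "" if none match.
local function match_any(source, patterns, plain)
  if plain == nil then plain = true end       -- plain text matching is the default
  for i, pattern in ipairs(patterns) do
    if string.find(source, pattern, 1, plain) then
      return tostring(i)
    end
  end
  return ""
end

print(match_any("abc 124", { "123", "abc" }))
```

Returning a string index makes the result usable directly as a #switch key, and the empty string tests as false in #if.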

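hyphen2dash

Extracted hyphen_to_dash() function from Module:Citation/CS1.

Converts a hyphen to a dash under certain conditions. The hyphen must separate like items; unlike items are returned unmodified. These forms are modified:
 letter - letter (A - B)
 digit - digit (4-5)
 digit separator digit - digit separator digit (4.1-4.5 or 4-1-4-5)
 letterdigit - letterdigit (A1-A5); an optional separator between letter and digit is supported (a.1-a.5 or a-1-a-5)
 digitletter - digitletter (5a - 5d); an optional separator between digit and letter is supported (5.a-5.d or 5-a-5-d)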

Any other forms are returned unmodified.

The input string may be a comma- or semicolon-separated list. Semicolons are converted to commas.

{{#invoke:String2|hyphen2dash|1=1-2}} returns 1–2.

{{#invoke:String2|hyphen2dash|1=1-2; 4–10}} returns 1–2, 4–10.

Accept-this-as-written markup is supported, e.g. {{#invoke:String2|hyphen2dash|1=((1-2)); 4–10}} returns 1-2, 4–10.

By default, a normal space is inserted after the separating comma in lists. An optional second parameter allows this to be changed to a different character (e.g. a thin space or hair space).

startswith

A startswith function similar to {{#invoke:string|endswith}}. Both parameters are required, although they can be blank. Leading and trailing whitespace is counted; use named parameters to avoid this if required. It outputs "yes" for true and blank for false, so the result may be passed directly to #if.

Markup → Renders as
{{#invoke:string2|startswith|search|se}} → yes
{{#invoke:string2|startswith|search|ch}} → (blank)
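The comparison itself is a one-line prefix test, sketched here in plain Lua with ordinary string arguments instead of a frame object (an assumption made so that the example runs outside MediaWiki).

```lua
-- Return "yes" when text begins with prefix, otherwise the empty string.
local function startswith(text, prefix)
  return (text:sub(1, #prefix) == prefix) and "yes" or ""
end

print(startswith("search", "se"))
```

A blank prefix always matches, and leading whitespace in either argument takes part in the comparison.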

Usage

Parameters

The case-conversion functions title, sentence and ucfirst take one unnamed parameter comprising (or invoking as a string) the text to be manipulated.

Examples

Input → Output
{{#invoke:String2| ucfirst | abcd }} → Abcd
{{#invoke:String2| ucfirst | abCD }} → AbCD
{{#invoke:String2| ucfirst | ABcd }} → ABcd
{{#invoke:String2| ucfirst | ABCD }} → ABCD
{{#invoke:String2| ucfirst | 123abcd }} → 123abcd
{{#invoke:String2| ucfirst | }} → (blank)
{{#invoke:String2| ucfirst | human X chromosome }} → Human X chromosome
{{#invoke:String2 | ucfirst | {{#invoke:WikidataIB |getValue | P136 |fetchwikidata=ALL |onlysourced=no |qid=Q1396889}} }} → Roman à clef, satirical fiction, fable, dystopian fiction
{{#invoke:String2 | ucfirst | {{#invoke:WikidataIB |getValue | P106 |fetchwikidata=ALL |list=hlist |qid=Q453196}} }} →
  • university teacher
  • author
  • editor
  • educator

{{#invoke:String2| sentence | abcd }} → Abcd
{{#invoke:String2| sentence | abCD }} → Abcd
{{#invoke:String2| sentence | ABcd }} → Abcd
{{#invoke:String2| sentence | ABCD }} → Abcd
{{#invoke:String2| sentence | [[action game]] }} → Action game
{{#invoke:String2| sentence | [[trimix (breathing gas)|trimix]] }} → Trimix
{{#invoke:String2| sentence | }} → (blank)

{{#invoke:String2| title | abcd }} → Abcd
{{#invoke:String2| title | abCD }} → Abcd
{{#invoke:String2| title | ABcd }} → Abcd
{{#invoke:String2| title | ABCD }} → Abcd
{{#invoke:String2| title | }} → (blank)
{{#invoke:String2| title | the vitamins are in my fresh california raisins}} → The Vitamins Are in My Fresh California Raisins

String split

Template:String split is a convenience wrapper for the split function.

Modules may return strings with | as separators like this: {{#invoke:carousel | main | name = WPDogs | switchsecs = 5 }} → Dalmatian liver stacked.jpg | Dalmatian dog stacked for show

Lua patterns allow splitting at classes of characters such as punctuation, or on anything that isn't a letter (for the plain parameter, no is treated as false).

Named parameters force the trimming of leading and trailing spaces in the parameters and are generally clearer when used.

One2a

Template:One2a is a convenience wrapper for the one2a function.

Capitalisation is kept. Intended for use with {{Convert}}.

A foot. A mile. A kilometer. An inch.An amp. a foot. a mile. an inch. Alone at last. Onely the lonely. ONE ounce. A monkey.

See also

Module:String for the following functions:

Templates and modules related to capitalization

Magic words that rewrite the output (copy-paste will get the text as displayed, not as entered):


Templates that implement <nowiki>

    require ('strict');
    local p = {}
    
    p.trim = function(frame)
     return mw.text.trim(frame.args[1] or "")
    end
    
    p.sentence = function (frame)
     -- {{lc:}} is strip-marker safe, string.lower is not.
     frame.args[1] = frame:callParserFunction('lc', frame.args[1])
     return p.ucfirst(frame)
    end
    
    p.ucfirst = function (frame )
     local s = frame.args[1];
     if not s or '' == s or s:match ('^%s+$') then        -- when <s> is nil, empty, or only whitespace
      return s;                -- abandon because nothing to do
     end
    
     s =  mw.text.trim( frame.args[1] or "" )
     local s1 = ""
    
     local prefix_patterns_t = {             -- sequence of prefix patterns
      '^\127[^\127]*UNIQ%-%-%a+%-%x+%-QINU[^\127]*\127',      -- stripmarker
      '^([%*;:#]+)',               -- various list markup
      '^(\'\'\'*)',               -- bold / italic markup
      '^(%b<>)',                -- html-like tags because some templates render these
      '^(&%a+;)',                -- html character entities because some templates render these
      '^(&#%d+;)',               -- html numeric (decimal) entities because some templates render these
      '^(&#x%x+;)',               -- html numeric (hexadecimal) entities because some templates render these
      '^(%s+)',                -- any whitespace characters
      '^([%(%)%-%+%?%.%%!~!@%$%^&_={}/`,‘’„“”ʻ|\"\'\\]+)',     -- miscellaneous punctuation
      }
     
     local prefixes_t = {};              -- list, bold/italic, and html-like markup, & whitespace saved here
    
     local function prefix_strip (s)            -- local function to strip prefixes from <s>
      for _, pattern in ipairs (prefix_patterns_t) do       -- spin through <prefix_patterns_t> 
       if s:match (pattern) then           -- when there is a match
        local prefix = s:match (pattern);        -- get a copy of the matched prefix
        table.insert (prefixes_t, prefix);        -- save it
        s = s:sub (prefix:len() + 1);         -- remove the prefix from <s>
        return s, true;             -- return <s> without prefix and flag; force restart at top of sequence because misc punct removal can break stripmarker
       end
      end
      return s;                -- no prefix found; return <s> with nil flag
     end
    
     local prefix_removed;              -- flag; boolean true as long as prefix_strip() finds and removes a prefix
     
     repeat                  -- one by one remove list, bold/italic, html-like markup, whitespace, etc from start of <s>
      s, prefix_removed = prefix_strip (s);
     until (not prefix_removed);             -- until <prefix_removed> is nil
    
     s1 = table.concat (prefixes_t);            -- recreate the prefix string for later reattachment
    
     local first_text = mw.ustring.match (s, '^%[%[[^%]]+%]%]');     -- extract wikilink at start of string if present; TODO: this can be string.match()?
    
     local upcased;
     if first_text then
      if first_text:match ('^%[%[[^|]+|[^%]]+%]%]') then      -- if <first_text> is a piped link
       upcased = mw.ustring.match (s, '^%[%[[^|]+|%W*(%w)');    -- get first letter character
       upcased = mw.ustring.upper (upcased);        -- upcase first letter character
       s = mw.ustring.gsub (s, '^(%[%[[^|]+|%W*)%w', '%1' .. upcased);  -- replace
      else                 -- here when <first_text> is a wikilink but not a piped link
       upcased = mw.ustring.match (s, '^%[%[%W*%w');      -- get '[[' and first letter
       upcased = mw.ustring.upper (upcased);        -- upcase first letter character
       s = mw.ustring.gsub (s, '^%[%[%W*%w', upcased);      -- replace; no capture needed here
      end
    
     elseif s:match ('^%[%S+%s+[^%]]+%]') then         -- if <s> is a ext link of some sort; must have label text
      upcased = mw.ustring.match (s, '^%[%S+%s+%W*(%w)');      -- get first letter character
      upcased = mw.ustring.upper (upcased);         -- upcase first letter character
      s = mw.ustring.gsub (s, '^(%[%S+%s+%W*)%w', '%1' .. upcased);   -- replace
     
     elseif s:match ('^%[%S+%s*%]') then           -- if <s> is a ext link without label text; nothing to do
      return s1 .. s;               -- reattach prefix string (if present) and done
    
     else                  -- <s> is not a wikilink or ext link; assume plain text
      upcased = mw.ustring.match (s, '^%W*%w');        -- get the first letter character
      if upcased then               -- nil when <s> holds no letter or digit; then leave <s> unchanged
       upcased = mw.ustring.upper (upcased);        -- upcase first letter character
       s = mw.ustring.gsub (s, '^%W*%w', upcased);       -- replace; no capture needed here
      end
     end
    
     return s1 .. s;                -- reattach prefix string (if present) and done
    end
    
    
    p.title = function (frame )
     -- http://grammar.yourdictionary.com/capitalization/rules-for-capitalization-in-titles.html
     -- recommended by The U.S. Government Printing Office Style Manual:
     -- "Capitalize all words in titles of publications and documents,
     -- except a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor."
     local alwayslower = {['a'] = 1, ['an'] = 1, ['the'] = 1,
      ['and'] = 1, ['but'] = 1, ['or'] = 1, ['for'] = 1,
      ['nor'] = 1, ['on'] = 1, ['in'] = 1, ['at'] = 1, ['to'] = 1,
      ['from'] = 1, ['by'] = 1, ['of'] = 1, ['up'] = 1 }
     local res = ''
     local s =  mw.text.trim( frame.args[1] or "" )
     local words = mw.text.split( s, " ")
     for i, s in ipairs(words) do
      -- {{lc:}} is strip-marker safe, string.lower is not.
      s = frame:callParserFunction('lc', s)
      if i == 1 or alwayslower[s] ~= 1 then
       s = mw.getContentLanguage():ucfirst(s)
      end
      words[i] = s
     end
     return table.concat(words, " ")
    end
    
    -- findlast finds the last item in a list
    -- the first unnamed parameter is the list
    -- the second, optional unnamed parameter is the list separator (default = comma space)
    -- returns the whole list if separator not found
    p.findlast = function(frame)
     local s =  mw.text.trim( frame.args[1] or "" )
     local sep = frame.args[2] or ""
     if sep == "" then sep = ", " end
     local pattern = ".*" .. sep .. "(.*)"
     local a, b, last = s:find(pattern)
     if a then
      return last
     else
      return s
     end
    end
    
    -- stripZeros finds the first number and strips leading zeros (apart from units)
    -- e.g "0940" -> "940"; "Year: 0023" -> "Year: 23"; "00.12" -> "0.12"
    p.stripZeros = function(frame)
     local s = mw.text.trim(frame.args[1] or "")
     local n = tonumber( string.match( s, "%d+" ) ) or ""
     s = string.gsub( s, "%d+", n, 1 )
     return s
    end
    
    -- nowiki ensures that a string of text is treated by the MediaWiki software as just a string
    -- it takes an unnamed parameter and trims whitespace, then escapes any wikicode
    p.nowiki = function(frame)
     local str = mw.text.trim(frame.args[1] or "")
     return mw.text.nowiki(str)
    end
    
    -- split splits text at boundaries specified by separator
    -- and returns the chunk for the index idx (starting at 1)
    -- #invoke:String2 |split |text |separator |index |true/false
    -- #invoke:String2 |split |txt=text |sep=separator |idx=index |plain=true/false
    -- if plain is false/no/0 then separator is treated as a Lua pattern - defaults to plain=true
    p.split = function(frame)
     local args = frame.args
     if not(args[1] or args.txt) then args = frame:getParent().args end
     local txt = args[1] or args.txt or ""
     if txt == "" then return nil end
     local sep = (args[2] or args.sep or ""):gsub('"', '')
     local idx = tonumber(args[3] or args.idx) or 1
     local plain = (args[4] or args.plain or "true"):sub(1,1)
     plain = (plain ~= "f" and plain ~= "n" and plain ~= "0")
     local splittbl = mw.text.split( txt, sep, plain )
     if idx < 0 then idx = #splittbl + idx + 1 end
     return splittbl[idx]
    end
    
    -- val2percent scans through a string, passed as either the first unnamed parameter or |txt=
    -- it converts each number it finds into a percentage and returns the resultant string.
    p.val2percent = function(frame)
     local args = frame.args
     if not(args[1] or args.txt) then args = frame:getParent().args end
     local txt = mw.text.trim(args[1] or args.txt or "")
     if txt == "" then return nil end
     local function v2p (x)
      x = (tonumber(x) or 0) * 100
      if x == math.floor(x) then x = math.floor(x) end
      return x .. "%"
     end
     txt = txt:gsub("%d[%d%.]*", v2p) -- store just the string
     return txt
    end
    
    -- one2a scans through a string, passed as either the first unnamed parameter or |txt=
    -- it converts each occurrence of 'one ' into either 'a ' or 'an ' and returns the resultant string.
    p.one2a = function(frame)
     local args = frame.args
     if not(args[1] or args.txt) then args = frame:getParent().args end
     local txt = mw.text.trim(args[1] or args.txt or "")
     if txt == "" then return nil end
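     -- match 'one ' with its trailing space at the start of the text so that words such as 'oneself' are left alone
     txt = txt:gsub(" one ", " a "):gsub("^one ", "a "):gsub("One ", "A "):gsub("a ([aeiou])", "an %1"):gsub("A ([aeiou])", "An %1")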
     return txt
    end
    
    -- findpagetext returns the position of a piece of text in a page
    -- First positional parameter or |text is the search text
    -- Optional parameter |title is the page title, defaults to current page
    -- Optional parameter |plain is either true for plain search (default) or false for Lua pattern search
    -- Optional parameter |nomatch is the return value when no match is found; default is nil
    p._findpagetext = function(args)
     -- process parameters
     local nomatch = args.nomatch or ""
     if nomatch == "" then nomatch = nil end
     --
     local text = mw.text.trim(args[1] or args.text or "")
     if text == "" then return nil end
     --
     local title = args.title or ""
     local titleobj
     if title == "" then
      titleobj = mw.title.getCurrentTitle()
     else
      titleobj = mw.title.new(title)
     end
     --
     local plain = args.plain or ""
     if plain:sub(1, 1) == "f" then plain = false else plain = true end
     -- get the page content and look for 'text' - return position or nomatch
     local content = titleobj and titleobj:getContent()
     return content and mw.ustring.find(content, text, 1, plain) or nomatch
    end
    p.findpagetext = function(frame)
     local args = frame.args
     local pargs = frame:getParent().args
     for k, v in pairs(pargs) do
      args[k] = v
     end
     if not (args[1] or args.text) then return nil end
     -- just the first value
     return (p._findpagetext(args))
    end
    
    -- returns the decoded url. Inverse of parser function {{urlencode:val|TYPE}}
    -- Type is:
    -- QUERY decodes + to space (default)
    -- PATH does no extra decoding
    -- WIKI decodes _ to space
    p._urldecode = function(url, type)
     url = url or ""
     type = (type == "PATH" or type == "WIKI") and type
     return mw.uri.decode( url, type )
    end
    -- {{#invoke:String2|urldecode|url=url|type=type}}
    p.urldecode = function(frame)
     return mw.uri.decode( frame.args.url or "", frame.args.type )  -- a missing |url= decodes to the empty string instead of raising an error
    end
    
    -- what follows was merged from Module:StringFunc
    
    -- helper functions
    p._GetParameters = require('Module:GetParameters')
    
    -- Argument list helper function, as per Module:String
    p._getParameters = p._GetParameters.getParameters
    
    -- Escape Pattern helper function so that all characters are treated as plain text, as per Module:String
    function p._escapePattern( pattern_str )
     return mw.ustring.gsub( pattern_str, "([%(%)%.%%%+%-%*%?%[%^%$%]])", "%%%1" )
    end
    
    -- Helper Function to interpret boolean strings, as per Module:String
    p._getBoolean = p._GetParameters.getBoolean
    
    --[[
    Strip
    
    This function Strips characters from string
    
    Usage:
    {{#invoke:String2|strip|source_string|characters_to_strip|plain_flag}}
    
    Parameters
     source: The string to strip
     chars:  The pattern or list of characters to strip from string, replaced with ''
     plain:  A flag indicating that the chars should be understood as plain text. defaults to true.
    
    Leading and trailing whitespace is also automatically stripped from the string.
    ]]
    function p.strip( frame )
     local new_args = p._getParameters( frame.args,  {'source', 'chars', 'plain'} )
     local source_str = new_args['source'] or ''
     local chars = new_args['chars'] or ''
     source_str = mw.text.trim(source_str)
     if source_str == '' or chars == '' then
      return source_str
     end
     local l_plain = p._getBoolean( new_args['plain'] or true )
     if l_plain then
      chars = p._escapePattern( chars )
     end
     local result
     result = mw.ustring.gsub(source_str, "["..chars.."]", '')
     return result
    end
    
    --[[
    Match any
    Returns the index of the first given pattern to match the input. Patterns must be consecutively numbered.
    Returns the empty string if nothing matches for use in {{#if:}}
    
    Usage:
     {{#invoke:String2|matchAny|source=123 abc|456|abc}} returns '2'.
    
    Parameters:
     source: the string to search
     plain:  A flag indicating that the patterns should be understood as plain text. defaults to true.
     1, 2, 3, ...: the patterns to search for
    ]]
    function p.matchAny(frame)
     local source_str = frame.args['source'] or error('The source parameter is mandatory.')
     local l_plain = p._getBoolean( frame.args['plain'] or true )
     for i = 1, math.huge do
      local pattern = frame.args[i]
      if not pattern then return '' end
      if mw.ustring.find(source_str, pattern, 1, l_plain) then
       return tostring(i)
      end
     end
    end
    
    --[[--------------------------< H Y P H E N _ T O _ D A S H >--------------------------------------------------
    
    Converts a hyphen to a dash under certain conditions.  The hyphen must separate
    like items; unlike items are returned unmodified.  These forms are modified:
     letter - letter (A - B)
     digit - digit (4-5)
     digit separator digit - digit separator digit (4.1-4.5 or 4-1-4-5)
     letterdigit - letterdigit (A1-A5) (an optional separator between letter and
      digit is supported – a.1-a.5 or a-1-a-5)
     digitletter - digitletter (5a - 5d) (an optional separator between letter and
      digit is supported – 5.a-5.d or 5-a-5-d)
    
    any other forms are returned unmodified.
    
    str may be a comma- or semicolon-separated list
    
    ]]
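    function p.hyphen_to_dash( str, spacing )
     if (str == nil or str == '') then
      return str
     end
     spacing = spacing or ' '              -- default to a normal space so that direct Lua callers may omit <spacing>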
    
     local accept
    
     str = mw.text.decode(str, true )           -- replace html entities with their characters; semicolon mucks up the text.split
    
     local out = {}
     local list = mw.text.split (str, '%s*[,;]%s*')        -- split str at comma or semicolon separators if there are any
    
     for _, item in ipairs (list) do            -- for each item in the list
      item = mw.text.trim(item)            -- trim whitespace
      item, accept = item:gsub ('^%(%((.+)%)%)$', '%1')
      if accept == 0 and mw.ustring.match (item, '^%w*[%.%-]?%w+%s*[%-–—]%s*%w*[%.%-]?%w+$') then -- if a hyphenated range or has endash or emdash separators
       if item:match ('^%a+[%.%-]?%d+%s*%-%s*%a+[%.%-]?%d+$') or   -- letterdigit hyphen letterdigit (optional separator between letter and digit)
        item:match ('^%d+[%.%-]?%a+%s*%-%s*%d+[%.%-]?%a+$') or   -- digitletter hyphen digitletter (optional separator between digit and letter)
        item:match ('^%d+[%.%-]%d+%s*%-%s*%d+[%.%-]%d+$') or   -- digit separator digit hyphen digit separator digit
        item:match ('^%d+%s*%-%s*%d+$') or        -- digit hyphen digit
        item:match ('^%a+%s*%-%s*%a+$') then       -- letter hyphen letter
         item = item:gsub ('(%w*[%.%-]?%w+)%s*%-%s*(%w*[%.%-]?%w+)', '%1–%2') -- replace hyphen, remove extraneous space characters
       else
        item = mw.ustring.gsub (item, '%s*[–—]%s*', '–')    -- for endash or emdash separated ranges, replace em with en, remove extraneous whitespace
       end
      end
      table.insert (out, item)            -- add the (possibly modified) item to the output table
     end
    
     local temp_str = table.concat (out, ',' .. spacing)       -- concatenate the output table into a comma separated string
     temp_str, accept = temp_str:gsub ('^%(%((.+)%)%)$', '%1')     -- remove accept-this-as-written markup when it wraps all of concatenated out
     if accept ~= 0 then
      temp_str = str:gsub ('^%(%((.+)%)%)$', '%1')       -- when global markup removed, return original str; do it this way to suppress boolean second return value
     end
     return temp_str
    end
    
    function p.hyphen2dash( frame )
     local str = frame.args[1] or ''
     local spacing = frame.args[2] or ' ' -- space is part of the standard separator for normal spacing (but in conjunction with templates r/rp/ran we may need a narrower spacing
    
     return p.hyphen_to_dash(str, spacing)
    end
    
    -- Similar to [[Module:String#endswith]]
    function p.startswith(frame)
     return (frame.args[1]:sub(1, frame.args[2]:len()) == frame.args[2]) and 'yes' or ''
    end
    
    return p
    

    Retrieved from "https://en.wikipedia.org/w/index.php?title=Module:String2/sandbox&oldid=1213022240"

    Category: 
    Module sandboxes
     



    This page was last edited on 10 March 2024, at 18:45 (UTC).
