Documentation ¶
Overview ¶
Package coregex provides a high-performance regex engine for Go.
coregex reports a 5-50x speedup over Go's stdlib regexp for patterns with extractable literals (simple patterns perform comparably) through:
- Multi-engine architecture (NFA, Lazy DFA, prefilters)
- SIMD-accelerated primitives (memchr, memmem, teddy)
- Literal extraction and prefiltering
- Automatic strategy selection
The public API is compatible with stdlib regexp where possible, making it easy to migrate existing code.
Basic usage:
// Compile a pattern
re, err := coregex.Compile(`\d+`)
if err != nil {
	log.Fatal(err)
}
// Find first match
match := re.Find([]byte("hello 123 world"))
fmt.Println(string(match)) // "123"
// Check if matches
if re.Match([]byte("hello 123")) {
	fmt.Println("matched!")
}
Advanced usage:
// Custom configuration
config := coregex.DefaultConfig()
config.MaxDFAStates = 50000
re, err := coregex.CompileWithConfig("(a|b|c)*", config)
Performance characteristics:
- Patterns with literals: 5-50x faster (prefilter optimization)
- Simple patterns: comparable to stdlib
- Complex patterns: 2-10x faster (DFA avoids backtracking)
- Worst case: guaranteed O(m*n) (ReDoS safe)
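Limitations (as listed for v1.0; the capture-group and replace methods documented below, added in v0.2.0 and v0.3.0, indicate that the first two entries are out of date):
- No capture groups (coming in v1.1)
- No replace functions (coming in v1.1)
- No multiline/case-insensitive flags (coming in v1.1)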
Index ¶
- func DefaultConfig() meta.Config
- func Match(pattern string, b []byte) (matched bool, err error)
- func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)
- func MatchString(pattern string, s string) (matched bool, err error)
- func QuoteMeta(s string) string
- type Regex
- func (r *Regex) Copy() *Regex (deprecated)
- func (r *Regex) Count(b []byte, n int) int
- func (r *Regex) CountString(s string, n int) int
- func (r *Regex) Expand(dst []byte, template []byte, src []byte, match []int) []byte
- func (r *Regex) ExpandString(dst []byte, template string, src string, match []int) []byte
- func (r *Regex) Find(b []byte) []byte
- func (r *Regex) FindAll(b []byte, n int) [][]byte
- func (r *Regex) FindAllIndex(b []byte, n int) [][]int
- func (r *Regex) FindAllIndexCompact(b []byte, n int, results [][2]int) [][2]int
- func (r *Regex) FindAllString(s string, n int) []string
- func (r *Regex) FindAllStringIndex(s string, n int) [][]int
- func (r *Regex) FindAllStringIndexCompact(s string, n int, results [][2]int) [][2]int
- func (r *Regex) FindAllStringSubmatch(s string, n int) [][]string
- func (r *Regex) FindAllStringSubmatchIndex(s string, n int) [][]int
- func (r *Regex) FindAllSubmatch(b []byte, n int) [][][]byte
- func (r *Regex) FindAllSubmatchIndex(b []byte, n int) [][]int
- func (r *Regex) FindIndex(b []byte) []int
- func (r *Regex) FindReaderIndex(reader io.RuneReader) []int
- func (r *Regex) FindReaderSubmatchIndex(reader io.RuneReader) []int
- func (r *Regex) FindString(s string) string
- func (r *Regex) FindStringIndex(s string) []int
- func (r *Regex) FindStringSubmatch(s string) []string
- func (r *Regex) FindStringSubmatchIndex(s string) []int
- func (r *Regex) FindSubmatch(b []byte) [][]byte
- func (r *Regex) FindSubmatchIndex(b []byte) []int
- func (r *Regex) LiteralPrefix() (prefix string, complete bool)
- func (r *Regex) Longest()
- func (r *Regex) MarshalText() ([]byte, error)
- func (r *Regex) Match(b []byte) bool
- func (r *Regex) MatchReader(reader io.RuneReader) bool
- func (r *Regex) MatchString(s string) bool
- func (r *Regex) NumSubexp() int
- func (r *Regex) ReplaceAll(src, repl []byte) []byte
- func (r *Regex) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte
- func (r *Regex) ReplaceAllLiteral(src, repl []byte) []byte
- func (r *Regex) ReplaceAllLiteralString(src, repl string) string
- func (r *Regex) ReplaceAllString(src, repl string) string
- func (r *Regex) ReplaceAllStringFunc(src string, repl func(string) string) string
- func (r *Regex) Split(s string, n int) []string
- func (r *Regex) String() string
- func (r *Regex) SubexpIndex(name string) int
- func (r *Regex) SubexpNames() []string
- func (r *Regex) UnmarshalText(text []byte) error
- type Regexp
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DefaultConfig ¶
DefaultConfig returns the default configuration for compilation.
Users can customize this and pass to CompileWithConfig.
Example:
config := coregex.DefaultConfig()
config.EnableDFA = false // Use NFA only
re, _ := coregex.CompileWithConfig("pattern", config)
func Match ¶ added in v0.10.7
Match reports whether the byte slice b contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.
func MatchReader ¶ added in v0.10.7
func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)
MatchReader reports whether the text returned by the RuneReader contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.
func MatchString ¶ added in v0.10.7
MatchString reports whether the string s contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.
func QuoteMeta ¶ added in v0.8.2
QuoteMeta returns a string that escapes all regular expression metacharacters inside the argument text; the returned string is a regular expression matching the literal text.
Example:
escaped := coregex.QuoteMeta("hello.world")
// escaped = "hello\\.world"
re := coregex.MustCompile(escaped)
re.MatchString("hello.world") // true
Types ¶
type Regex ¶
type Regex struct {
// contains filtered or unexported fields
}
Regex represents a compiled regular expression.
A Regex is safe to use concurrently from multiple goroutines, except for methods that modify internal state (like ResetStats).
Example:
re := coregex.MustCompile(`hello`)
if re.Match([]byte("hello world")) {
	println("matched!")
}
func Compile ¶
Compile compiles a regular expression pattern.
Syntax is the same as Go's stdlib regexp (RE2 syntax, which is largely Perl-like but omits backreferences and lookaround). Returns an error if the pattern is invalid.
Example:
re, err := coregex.Compile(`\d{3}-\d{4}`)
if err != nil {
	log.Fatal(err)
}
Example ¶
ExampleCompile demonstrates basic pattern compilation and matching.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re, err := coregex.Compile(`\d+`)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.Match([]byte("hello 123")))
}

Output: true
func CompilePOSIX ¶ added in v0.10.7
CompilePOSIX is like Compile but restricts the regular expression to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest.
That is, when matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses a match that is as long as possible. This so-called leftmost-longest matching is the same semantics that early regular expression implementations used and that POSIX specifies.
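The difference between the two semantics shows up with alternations whose branches share a prefix. The sketch below uses stdlib regexp, which exposes the same Compile/CompilePOSIX pair; firstMatches is a hypothetical helper, and coregex is assumed to follow the same behavior as described above.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatches returns the match found under leftmost-first (Perl-style)
// semantics and under leftmost-longest (POSIX) semantics.
func firstMatches(pattern, input string) (perl, posix string) {
	perl = regexp.MustCompile(pattern).FindString(input)
	posix = regexp.MustCompilePOSIX(pattern).FindString(input)
	return perl, posix
}

func main() {
	perl, posix := firstMatches(`a|ab`, "ab")
	fmt.Println(perl)  // a  (first alternative wins)
	fmt.Println(posix) // ab (longest match wins)
}
```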
func CompileWithConfig ¶
CompileWithConfig compiles a pattern with custom configuration.
This allows fine-tuning of performance characteristics.
Example:
config := coregex.DefaultConfig()
config.MaxDFAStates = 100000 // Larger cache
re, err := coregex.CompileWithConfig("(a|b|c)*", config)
Example ¶
ExampleCompileWithConfig demonstrates custom configuration.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	config := coregex.DefaultConfig()
	config.MaxDFAStates = 50000 // Increase cache size
	re, err := coregex.CompileWithConfig("(a|b|c)*", config)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("abcabc"))
}

Output: true
func MustCompile ¶
MustCompile compiles a regular expression pattern and panics if it fails.
This is useful for patterns known to be valid at compile time.
Example:
var emailRegex = coregex.MustCompile(`[a-z]+@[a-z]+\.[a-z]+`)
Example ¶
ExampleMustCompile demonstrates panic-on-error compilation.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`hello`)
	fmt.Println(re.MatchString("hello world"))
}

Output: true
func MustCompilePOSIX ¶ added in v0.10.7
MustCompilePOSIX is like CompilePOSIX but panics if the expression cannot be parsed. It simplifies safe initialization of global variables holding compiled regular expressions.
func (*Regex) Copy ¶ deprecated, added in v0.10.7
Copy returns a new Regex object copied from re. Calling Longest on one copy does not affect another.
Deprecated: In earlier releases, when using a Regexp in multiple goroutines, giving each goroutine its own copy helped to avoid lock contention. As of Go 1.12, using Copy is no longer necessary to avoid lock contention. Copy may still be appropriate if the reason for its use is to make two copies with different Longest settings.
func (*Regex) Count ¶ added in v0.4.0
Count returns the number of non-overlapping matches of the pattern in b. If n > 0, counts at most n matches. If n <= 0, counts all matches.
This is optimized for counting without building result slices.
Example:
re := coregex.MustCompile(`\d+`)
count := re.Count([]byte("1 2 3 4 5"), -1)
// count == 5
func (*Regex) CountString ¶ added in v0.4.0
CountString returns the number of non-overlapping matches of the pattern in s. If n > 0, counts at most n matches. If n <= 0, counts all matches.
Example:
re := coregex.MustCompile(`\d+`)
count := re.CountString("1 2 3 4 5", -1)
// count == 5
func (*Regex) Expand ¶ added in v0.10.7
Expand appends template to dst and returns the result; during the append, Expand replaces variables in the template with corresponding matches drawn from src. The match slice should contain the progressively numbered submatches as returned by FindSubmatchIndex.
In the template, a variable is denoted by a substring of the form $name or ${name}, where name is a non-empty sequence of letters, digits, and underscores. A purely numeric name like $1 refers to the submatch with the corresponding index; other names refer to capturing parentheses named with the (?P<name>...) syntax. A reference to an out of range or unmatched index or a name that is not present in the regular expression is replaced with an empty slice.
In the $name form, name is taken to be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and, $10 is equivalent to ${10}, not ${1}0.
To insert a literal $ in the output, use $$ in the template.
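The template rules are easiest to see side by side. This sketch uses stdlib regexp, whose Expand the text above mirrors; the pattern, input, and the helper expand are illustrative assumptions, and coregex is assumed to apply the same rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// expand applies a template to the first match of a key=value pattern
// in a fixed input and returns the expanded text.
func expand(template string) string {
	re := regexp.MustCompile(`(?P<key>\w+)=(?P<value>\w+)`)
	src := []byte("host=local")
	m := re.FindSubmatchIndex(src)
	return string(re.Expand(nil, []byte(template), src, m))
}

func main() {
	fmt.Println(expand("$value<-$key")) // local<-host
	fmt.Println(expand("${1}x"))        // hostx
	fmt.Printf("%q\n", expand("$1x"))   // "" ($1x means ${1x}, which does not exist)
	fmt.Println(expand("$$1"))          // $1 ($$ is a literal dollar sign)
}
```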
func (*Regex) ExpandString ¶ added in v0.10.7
ExpandString is like Expand but the template and source are strings. It appends to and returns a byte slice in order to give the caller control over allocation.
func (*Regex) Find ¶
Find returns a slice holding the text of the leftmost match in b. Returns nil if no match is found.
Example:
re := coregex.MustCompile(`\d+`)
match := re.Find([]byte("age: 42"))
println(string(match)) // "42"
Example ¶
ExampleRegex_Find demonstrates finding the first match.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`\d+`)
	match := re.Find([]byte("age: 42 years"))
	fmt.Println(string(match))
}

Output: 42
func (*Regex) FindAll ¶
FindAll returns a slice of all successive matches of the pattern in b. If n > 0, it returns at most n matches. If n <= 0, it returns all matches.
Example:
re := coregex.MustCompile(`\d+`)
matches := re.FindAll([]byte("1 2 3"), -1)
// matches = [[]byte("1"), []byte("2"), []byte("3")]
Example ¶
ExampleRegex_FindAll demonstrates finding all matches.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`\d`)
	matches := re.FindAll([]byte("a1b2c3"), -1)
	for _, m := range matches {
		fmt.Print(string(m), " ")
	}
	fmt.Println()
}

Output: 1 2 3
func (*Regex) FindAllIndex ¶ added in v0.3.0
FindAllIndex returns a slice of all successive matches of the pattern in b, as index pairs [start, end]. If n > 0, it returns at most n matches. If n <= 0, it returns all matches.
Example:
re := coregex.MustCompile(`\d+`)
indices := re.FindAllIndex([]byte("1 2 3"), -1)
// indices = [[0,1], [2,3], [4,5]]
func (*Regex) FindAllIndexCompact ¶ added in v0.10.6
FindAllIndexCompact returns all successive matches as a compact [][2]int slice. This is a low-allocation API: it makes a single allocation for the result slice. Unlike FindAllIndex, which returns [][]int (N allocations for N matches), this method allocates the entire result as one contiguous block.
Performance: ~2x fewer allocations than FindAllIndex for high match counts.
If n > 0, it returns at most n matches. If n <= 0, it returns all matches. The optional 'results' slice can be provided for reuse (set to nil for fresh allocation).
Example:
re := coregex.MustCompile(`\d+`)
indices := re.FindAllIndexCompact([]byte("a1b2c3"), -1, nil)
// indices = [[1,2], [3,4], [5,6]]
func (*Regex) FindAllString ¶
FindAllString returns a slice of all successive matches of the pattern in s. If n > 0, it returns at most n matches. If n <= 0, it returns all matches.
Example:
re := coregex.MustCompile(`\d+`)
matches := re.FindAllString("1 2 3", -1)
// matches = ["1", "2", "3"]
Example ¶
ExampleRegex_FindAllString demonstrates finding all string matches.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`\w+`)
	words := re.FindAllString("hello world test", -1)
	for _, word := range words {
		fmt.Print(word, " ")
	}
	fmt.Println()
}

Output: hello world test
func (*Regex) FindAllStringIndex ¶ added in v0.3.0
FindAllStringIndex returns a slice of all successive matches of the pattern in s, as index pairs [start, end]. If n > 0, it returns at most n matches. If n <= 0, it returns all matches.
Example:
re := coregex.MustCompile(`\d+`)
indices := re.FindAllStringIndex("1 2 3", -1)
// indices = [[0,1], [2,3], [4,5]]
func (*Regex) FindAllStringIndexCompact ¶ added in v0.10.6
FindAllStringIndexCompact returns all successive matches as a compact [][2]int slice. This is the string version of FindAllIndexCompact.
func (*Regex) FindAllStringSubmatch ¶ added in v0.4.0
FindAllStringSubmatch returns a slice of all successive matches of the pattern in s, where each match includes all capture groups as strings. If n > 0, returns at most n matches. If n <= 0, returns all matches.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
matches := re.FindAllStringSubmatch("a@b.c x@y.z", -1)
// len(matches) == 2
// matches[0][0] = "a@b.c"
// matches[0][1] = "a"
func (*Regex) FindAllStringSubmatchIndex ¶ added in v0.4.0
FindAllStringSubmatchIndex returns a slice of all successive matches of the pattern in s, where each match includes index pairs for all capture groups. If n > 0, returns at most n matches. If n <= 0, returns all matches.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
indices := re.FindAllStringSubmatchIndex("a@b.c x@y.z", -1)
func (*Regex) FindAllSubmatch ¶ added in v0.4.0
FindAllSubmatch returns a slice of all successive matches of the pattern in b, where each match includes all capture groups. If n > 0, returns at most n matches. If n <= 0, returns all matches.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
matches := re.FindAllSubmatch([]byte("a@b.c x@y.z"), -1)
// len(matches) == 2
// matches[0][0] = "a@b.c"
// matches[0][1] = "a"
func (*Regex) FindAllSubmatchIndex ¶ added in v0.4.0
FindAllSubmatchIndex returns a slice of all successive matches of the pattern in b, where each match includes index pairs for all capture groups. If n > 0, returns at most n matches. If n <= 0, returns all matches.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
indices := re.FindAllSubmatchIndex([]byte("a@b.c x@y.z"), -1)
// len(indices) == 2
// indices[0] contains start/end pairs for each group
func (*Regex) FindIndex ¶
FindIndex returns a two-element slice of integers defining the location of the leftmost match in b. The match is at b[loc[0]:loc[1]]. Returns nil if no match is found.
Example:
re := coregex.MustCompile(`\d+`)
loc := re.FindIndex([]byte("age: 42"))
println(loc[0], loc[1]) // 5, 7
Example ¶
ExampleRegex_FindIndex demonstrates finding match positions.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`\d+`)
	loc := re.FindIndex([]byte("age: 42"))
	fmt.Printf("Match at [%d:%d]\n", loc[0], loc[1])
}

Output: Match at [5:7]
func (*Regex) FindReaderIndex ¶ added in v0.10.7
func (r *Regex) FindReaderIndex(reader io.RuneReader) []int
FindReaderIndex returns a two-element slice of integers defining the location of the leftmost match of the regular expression in text read from the RuneReader. The match text was found in the input stream at byte offset loc[0] through loc[1]-1. A return value of nil indicates no match.
func (*Regex) FindReaderSubmatchIndex ¶ added in v0.10.7
func (r *Regex) FindReaderSubmatchIndex(reader io.RuneReader) []int
FindReaderSubmatchIndex returns a slice holding the index pairs identifying the leftmost match of the regular expression of text read by the RuneReader, and the matches, if any, of its subexpressions, as defined by the 'Submatch' and 'Index' descriptions in the package comment. A return value of nil indicates no match.
func (*Regex) FindString ¶
FindString returns a string holding the text of the leftmost match in s. Returns empty string if no match is found.
Example:
re := coregex.MustCompile(`\d+`)
match := re.FindString("age: 42")
println(match) // "42"
Example ¶
ExampleRegex_FindString demonstrates finding a match in a string.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`\w+@\w+\.\w+`)
	email := re.FindString("Contact: user@example.com")
	fmt.Println(email)
}

Output: user@example.com
func (*Regex) FindStringIndex ¶
FindStringIndex returns a two-element slice of integers defining the location of the leftmost match in s. The match is at s[loc[0]:loc[1]]. Returns nil if no match is found.
Example:
re := coregex.MustCompile(`\d+`)
loc := re.FindStringIndex("age: 42")
println(loc[0], loc[1]) // 5, 7
func (*Regex) FindStringSubmatch ¶ added in v0.2.0
FindStringSubmatch returns a slice of strings holding the text of the leftmost match and the matches of all capture groups.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
match := re.FindStringSubmatch("user@example.com")
// match[0] = "user@example.com"
// match[1] = "user"
Example ¶
ExampleRegex_FindStringSubmatch demonstrates capture groups with strings.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)
	match := re.FindStringSubmatch("Date: 2024-12-25")
	if match != nil {
		fmt.Printf("Year: %s, Month: %s, Day: %s\n", match[1], match[2], match[3])
	}
}

Output: Year: 2024, Month: 12, Day: 25
func (*Regex) FindStringSubmatchIndex ¶ added in v0.2.0
FindStringSubmatchIndex returns the index pairs for the leftmost match and capture groups. Same as FindSubmatchIndex but for strings.
func (*Regex) FindSubmatch ¶ added in v0.2.0
FindSubmatch returns a slice holding the text of the leftmost match and the matches of all capture groups.
A return value of nil indicates no match. Result[0] is the entire match, result[i] is the ith capture group. Unmatched groups will be nil.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
match := re.FindSubmatch([]byte("user@example.com"))
// match[0] = "user@example.com"
// match[1] = "user"
// match[2] = "example"
// match[3] = "com"
Example ¶
ExampleRegex_FindSubmatch demonstrates capture group extraction.
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
	match := re.FindSubmatch([]byte("Contact: user@example.com"))
	if match != nil {
		fmt.Println("Full match:", string(match[0]))
		fmt.Println("User:", string(match[1]))
		fmt.Println("Domain:", string(match[2]))
		fmt.Println("TLD:", string(match[3]))
	}
}

Output:
Full match: user@example.com
User: user
Domain: example
TLD: com
func (*Regex) FindSubmatchIndex ¶ added in v0.2.0
FindSubmatchIndex returns a slice holding the index pairs for the leftmost match and the matches of all capture groups.
A return value of nil indicates no match. Result[2*i:2*i+2] is the indices for the ith group. Unmatched groups have -1 indices.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
idx := re.FindSubmatchIndex([]byte("user@example.com"))
// idx[0:2] = indices for entire match
// idx[2:4] = indices for first capture group
func (*Regex) LiteralPrefix ¶ added in v0.10.7
LiteralPrefix returns a literal string that must begin any match of the regular expression re. It returns the boolean true if the literal string comprises the entire regular expression.
Example:
re := coregex.MustCompile(`Hello, \w+`)
prefix, complete := re.LiteralPrefix()
// prefix = "Hello, ", complete = false
re2 := coregex.MustCompile(`Hello`)
prefix2, complete2 := re2.LiteralPrefix()
// prefix2 = "Hello", complete2 = true
func (*Regex) Longest ¶ added in v0.8.2
func (r *Regex) Longest()
Longest makes future searches prefer the leftmost-longest match.
By default, coregex uses leftmost-first (Perl) semantics where the first alternative in an alternation wins. After calling Longest(), coregex uses leftmost-longest (POSIX) semantics where the longest match wins.
Example:
re := coregex.MustCompile(`(a|ab)`)
re.FindString("ab") // returns "a" (leftmost-first: first branch wins)
re.Longest()
re.FindString("ab") // returns "ab" (leftmost-longest: longest wins)
Note: Calling Longest() modifies the regex state and should not be called concurrently with search methods (stdlib regexp documents the same exception for configuration methods such as Longest).
func (*Regex) MarshalText ¶ added in v0.10.7
MarshalText implements encoding.TextMarshaler. The output is the result of r.String().
func (*Regex) Match ¶
Match reports whether the byte slice b contains any match of the pattern.
Example:
re := coregex.MustCompile(`\d+`)
if re.Match([]byte("hello 123")) {
	println("contains digits")
}
func (*Regex) MatchReader ¶ added in v0.10.7
func (r *Regex) MatchReader(reader io.RuneReader) bool
MatchReader reports whether the text returned by the RuneReader contains any match of the regular expression re.
func (*Regex) MatchString ¶
MatchString reports whether the string s contains any match of the pattern. This is a zero-allocation operation (like Rust's is_match).
Example:
re := coregex.MustCompile(`hello`)
if re.MatchString("hello world") {
	println("matched!")
}
func (*Regex) NumSubexp ¶ added in v0.2.0
NumSubexp returns the number of parenthesized subexpressions in this Regex. This does NOT include group 0 (the entire match), matching stdlib regexp behavior.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
println(re.NumSubexp()) // 3 (just the capture groups)
Example ¶
ExampleRegex_NumSubexp demonstrates counting capture groups. NumSubexp returns the number of parenthesized subexpressions, not counting the full match (group 0).
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
	fmt.Println("Number of groups:", re.NumSubexp())
}

Output: Number of groups: 3
func (*Regex) ReplaceAll ¶ added in v0.3.0
ReplaceAll returns a copy of src, replacing matches of the pattern with the replacement bytes repl. Inside repl, $ signs are interpreted as in Regexp.Expand: $0 is the entire match, $1 is the first capture group, etc.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
result := re.ReplaceAll([]byte("user@example.com"), []byte("$1 at $2 dot $3"))
// result = []byte("user at example dot com")
func (*Regex) ReplaceAllFunc ¶ added in v0.3.0
ReplaceAllFunc returns a copy of src in which all matches of the pattern have been replaced by the return value of function repl applied to the matched byte slice. The replacement returned by repl is substituted directly, without using Expand.
Example:
re := coregex.MustCompile(`\d+`)
result := re.ReplaceAllFunc([]byte("1 2 3"), func(s []byte) []byte {
	n, _ := strconv.Atoi(string(s))
	return []byte(strconv.Itoa(n * 2))
})
// result = []byte("2 4 6")
func (*Regex) ReplaceAllLiteral ¶ added in v0.3.0
ReplaceAllLiteral returns a copy of src, replacing matches of the pattern with the replacement bytes repl. The replacement is substituted directly, without expanding $ variables.
Example:
re := coregex.MustCompile(`\d+`)
result := re.ReplaceAllLiteral([]byte("age: 42"), []byte("XX"))
// result = []byte("age: XX")
func (*Regex) ReplaceAllLiteralString ¶ added in v0.3.0
ReplaceAllLiteralString returns a copy of src, replacing matches of the pattern with the replacement string repl. The replacement is substituted directly, without expanding $ variables.
Example:
re := coregex.MustCompile(`\d+`)
result := re.ReplaceAllLiteralString("age: 42", "XX")
// result = "age: XX"
func (*Regex) ReplaceAllString ¶ added in v0.3.0
ReplaceAllString returns a copy of src, replacing matches of the pattern with the replacement string repl. Inside repl, $ signs are interpreted as in Regexp.Expand: $0 is the entire match, $1 is the first capture group, etc.
Example:
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
result := re.ReplaceAllString("user@example.com", "$1 at $2 dot $3")
// result = "user at example dot com"
func (*Regex) ReplaceAllStringFunc ¶ added in v0.3.0
ReplaceAllStringFunc returns a copy of src in which all matches of the pattern have been replaced by the return value of function repl applied to the matched string. The replacement returned by repl is substituted directly, without using Expand.
Example:
re := coregex.MustCompile(`\d+`)
result := re.ReplaceAllStringFunc("1 2 3", func(s string) string {
	n, _ := strconv.Atoi(s)
	return strconv.Itoa(n * 2)
})
// result = "2 4 6"
func (*Regex) Split ¶ added in v0.3.0
Split slices s into substrings separated by the expression and returns a slice of the substrings between those expression matches.
The slice returned by this method consists of all the substrings of s not contained in the slice returned by FindAllString. When called on an expression that contains no metacharacters, it is equivalent to strings.SplitN.
The count determines the number of substrings to return:
- n > 0: at most n substrings; the last substring will be the unsplit remainder
- n == 0: the result is nil (zero substrings)
- n < 0: all substrings
Example:
re := coregex.MustCompile(`,`)
parts := re.Split("a,b,c", -1)
// parts = ["a", "b", "c"]
parts = re.Split("a,b,c", 2)
// parts = ["a", "b,c"]
func (*Regex) String ¶
String returns the source text used to compile the regular expression.
Example:
re := coregex.MustCompile(`\d+`)
println(re.String()) // `\d+`
func (*Regex) SubexpIndex ¶ added in v0.10.7
SubexpIndex returns the index of the first subexpression with the given name, or -1 if there is no subexpression with that name.
Note that multiple subexpressions can be written using the same name, as in (?P<bob>a+)(?P<bob>b+), which declares two subexpressions named "bob". In this case, SubexpIndex returns the index of the leftmost such subexpression in the regular expression.
Example:
re := coregex.MustCompile(`(?P<year>\d+)-(?P<month>\d+)`)
re.SubexpIndex("year") // returns 1
re.SubexpIndex("month") // returns 2
re.SubexpIndex("day") // returns -1
func (*Regex) SubexpNames ¶ added in v0.5.0
SubexpNames returns the names of the parenthesized subexpressions in this Regex. The name for the first sub-expression is names[1], so that if m is a match slice, the name for m[i] is SubexpNames()[i]. Since the Regexp as a whole cannot be named, names[0] is always the empty string. The slice returned is shared and must not be modified.
Example:
re := coregex.MustCompile(`(?P<year>\d+)-(?P<month>\d+)`)
names := re.SubexpNames()
// names[0] = ""
// names[1] = "year"
// names[2] = "month"
Example ¶
ExampleRegex_SubexpNames demonstrates named capture groups
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	// Pattern with named and unnamed captures
	re := coregex.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(\d{2})`)
	// Get capture group names
	// Note: SubexpNames includes group 0 (empty string for full match)
	// but NumSubexp only counts parenthesized subexpressions (groups 1-3)
	names := re.SubexpNames()
	fmt.Printf("Capture groups: %d\n", re.NumSubexp())
	fmt.Printf("Group 0 (full match): %q\n", names[0])
	fmt.Printf("Group 1 (year): %q\n", names[1])
	fmt.Printf("Group 2 (month): %q\n", names[2])
	fmt.Printf("Group 3 (day, unnamed): %q\n", names[3])
}

Output:
Capture groups: 3
Group 0 (full match): ""
Group 1 (year): "year"
Group 2 (month): "month"
Group 3 (day, unnamed): ""
Example (Matching) ¶
ExampleRegex_SubexpNames_matching shows using SubexpNames with matches
package main

import (
	"fmt"

	"github.com/coregx/coregex"
)

func main() {
	// Compile pattern with named captures
	re := coregex.MustCompile(`(?P<protocol>https?)://(?P<domain>\w+)`)
	// Find match and get submatch values
	match := re.FindStringSubmatch("Visit https://example for more")
	names := re.SubexpNames()
	// Print matches with their names
	for i, name := range names {
		if i < len(match) && match[i] != "" {
			if name != "" {
				fmt.Printf("%s: %s\n", name, match[i])
			} else if i == 0 {
				fmt.Printf("Full match: %s\n", match[i])
			}
		}
	}
}

Output:
Full match: https://example
protocol: https
domain: example
func (*Regex) UnmarshalText ¶ added in v0.10.7
UnmarshalText implements encoding.TextUnmarshaler by calling Compile on the encoded value.
type Regexp ¶ added in v0.8.1
type Regexp = Regex
Regexp is an alias for Regex to provide drop-in compatibility with stdlib regexp. This allows replacing `import "regexp"` with `import regexp "github.com/coregx/coregex"` without changing type names in existing code.
Example:
import regexp "github.com/coregx/coregex"

var re *regexp.Regexp = regexp.MustCompile(`\d+`)
Directories ¶
| Path | Synopsis |
|---|---|
| dfa | |
| dfa/lazy | Package lazy implements a Lazy DFA (Deterministic Finite Automaton) engine for regex matching. |
| dfa/onepass | Package onepass implements a one-pass DFA for regex patterns that have no ambiguity in their matching paths. |
| internal | |
| internal/conv | Package conv provides safe integer conversion helpers for the regex engine. |
| internal/sparse | Package sparse provides a sparse set data structure for efficient state tracking. |
| literal | Package literal provides types and operations for extracting literal sequences from regex patterns for prefilter optimization. |
| meta | Package meta implements the meta-engine orchestrator that automatically selects the optimal regex execution strategy. |
| nfa | Package nfa provides a Thompson NFA (Non-deterministic Finite Automaton) implementation for regex matching. |
| prefilter | Package prefilter provides fast candidate filtering for regex search using extracted literal sequences. |
| simd | Package simd provides SIMD-accelerated string operations for high-performance byte searching. |