Need help with regex for search & replace

Status
Not open for further replies.

mopacfan

Programmer
Oct 30, 2000
190
US
I am not well versed in regular expressions, and I'm having a hard time creating an expression to run against thousands of text files. I need to change every occurrence of a string such as "~24~56.htm", where the numbers vary. The replacement should strip out the first tilde and the first number, leaving the second tilde, the second number and the .htm extension (so "~24~56.htm" becomes "~56.htm").

Any help will be greatly appreciated.
 
This seems to work as the Pattern you would need:

[tt]"~\d{1,}(?=~\d{1,})"[/tt]

Simply call Regex.Replace with your input string, this pattern, and a replacement value of "" (an empty string).

The pattern says:
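Match a tilde followed by one or more digits, but only if that text is immediately followed by another tilde and one or more digits. The lookahead (?=...) checks the second part without including it in the match, so only the first tilde and number get replaced.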

Hope this helps.

 
Cool, that matches the string. So where does the "" go? I need to remove the first tilde and number pattern. Do I need to assign the second pattern match to a variable $1 so it can be used as the replacement string?
 
Never mind, I figured it out. I just use $1 as the replacement. Much appreciated. :)
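For anyone following along: $1 only refers to something if the pattern contains a capturing group. The thread doesn't show the exact pattern used with $1, so the one below is an assumption, a minimal sketch of the capture-group approach where the part to keep is wrapped in parentheses and written back via $1. The sample text and the module name are made up for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module CaptureGroupDemo
    Sub Main()
        ' Hypothetical sample text containing one link of the form ~N~M.htm.
        Dim s As String = "see ~24~56.htm for details"

        ' Assumed pattern (not from the thread): group 1 captures the part to keep,
        ' i.e. the second tilde, the second number and the ".htm" extension.
        Dim pattern As String = "~\d+(~\d+\.htm)"

        ' "$1" in the replacement string stands for whatever group 1 captured.
        Console.WriteLine(Regex.Replace(s, pattern, "$1"))   ' prints: see ~56.htm for details
    End Sub
End Module
```

With the lookahead pattern given earlier there is no group 1, so an empty replacement string is the one to use there.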
 
Try this:

Code:
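	' Requires "Imports System.Text.RegularExpressions" at the top of the file.
	Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

		Dim s As String = "~24~56.htm"
		' A tilde and digits, matched only when another tilde-and-digits group follows.
		Dim pattern As String = "~\d{1,}(?=~\d{1,})"

		Dim rx As New Regex(pattern)
		' Replacing the match with an empty string leaves "~56.htm".
		MessageBox.Show(rx.Replace(s, ""))

	End Sub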
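Since the original question was about thousands of text files, here is a rough sketch of how the same Replace call might be wrapped in a loop over a folder. The folder path, the "*.txt" filter and the module name are placeholders, not anything from the thread, and the default encoding handling of File.ReadAllText and File.WriteAllText is assumed to be acceptable for your files. Try it on a copy of the data first.

```vb
Imports System.IO
Imports System.Text.RegularExpressions

Module BulkReplaceDemo
    Sub Main()
        ' Hypothetical folder and file filter; adjust both to your own setup.
        Dim folder As String = "C:\pages"
        Dim rx As New Regex("~\d{1,}(?=~\d{1,})")

        ' Walk the folder and all of its subfolders.
        For Each filePath As String In Directory.GetFiles(folder, "*.txt", SearchOption.AllDirectories)
            Dim original As String = File.ReadAllText(filePath)
            Dim updated As String = rx.Replace(original, "")

            ' Only rewrite files whose content actually changed.
            If updated <> original Then
                File.WriteAllText(filePath, updated)
            End If
        Next
    End Sub
End Module
```

Creating the Regex once outside the loop avoids re-parsing the pattern for every file.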

By the way, a slightly more comprehensive pattern (should you need it) might be:

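[tt](~\d{1,}){1,}(?=~\d{1,}\.htm)[/tt]

which enforces the requirement that the match be followed by a tilde, a number and .htm (note the escaped dot, so it matches a literal period rather than any character), and also allows for more than one set of tilde/numbers to discard.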
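A quick console sketch of how that longer pattern behaves, with the dot escaped so that it matches a literal period. The three input strings and the module name are made up for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module MultiPrefixDemo
    Sub Main()
        ' One or more tilde/number sets, only when "~digits.htm" follows.
        Dim rx As New Regex("(~\d{1,}){1,}(?=~\d{1,}\.htm)")

        ' A single leading set is removed.
        Console.WriteLine(rx.Replace("~24~56.htm", ""))     ' prints: ~56.htm

        ' Several leading sets are removed in one match.
        Console.WriteLine(rx.Replace("~1~24~56.htm", ""))   ' prints: ~56.htm

        ' No ".htm" after the last set, so nothing matches and the text is unchanged.
        Console.WriteLine(rx.Replace("~24~56.txt", ""))     ' prints: ~24~56.txt
    End Sub
End Module
```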


Hope this helps.

 
Just a slight tweak to earthandfire's solution. "One or more occurrences" can be written in either of two ways: {1,} as shown, or + as in
Code:
Dim pattern As String = "~\d+(?=~\d+)"
If you need a specific count or a range of counts, then you can use something like {2} (exactly two) or {3,5} (three to five).
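To see the counted forms in action, here is a small sketch. The ^ and $ anchors are my addition so that the whole string has to fit the pattern; the test strings and the module name are invented.

```vb
Imports System
Imports System.Text.RegularExpressions

Module CountedQuantifierDemo
    Sub Main()
        ' {2}: exactly two digits after the tilde.
        Console.WriteLine(Regex.IsMatch("~24", "^~\d{2}$"))       ' prints: True
        Console.WriteLine(Regex.IsMatch("~245", "^~\d{2}$"))      ' prints: False

        ' {3,5}: between three and five digits after the tilde.
        Console.WriteLine(Regex.IsMatch("~2456", "^~\d{3,5}$"))   ' prints: True
        Console.WriteLine(Regex.IsMatch("~24", "^~\d{3,5}$"))     ' prints: False
    End Sub
End Module
```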

For most situations you can use one of the following:
* - zero or more times
? - zero or one time
+ - one or more times

Be aware that a ? placed after any of the above quantifiers switches it from greedy matching (take as much as possible) to lazy matching (take as little as possible). Good luck!
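A short sketch of the greedy/lazy difference, using a made-up string with three tilde/number sets. The patterns here are only for demonstration and are not meant as a fix for the original problem.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LazyVsGreedyDemo
    Sub Main()
        Dim s As String = "~24~56~78.htm"

        ' Greedy: .+ grabs as much as it can, then backs up to the last tilde.
        Console.WriteLine(Regex.Match(s, "~.+~").Value)    ' prints: ~24~56~

        ' Lazy: .+? grabs as little as it can, stopping at the first tilde it reaches.
        Console.WriteLine(Regex.Match(s, "~.+?~").Value)   ' prints: ~24~
    End Sub
End Module
```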
 
I have RegexBuddy, and it is very helpful to be able to see visually what each expression does. But as a newbie to regex, just getting up to speed is a bit challenging.

I'd like to thank each of you for your assistance. I'm starting to get the hang of this.
 