C# Programming Guide - Strings (Programming/C#, 2017. 5. 10. 17:02)
- Strings
- How to: Concatenate Multiple Strings
- How to: Modify String Contents
- How to: Compare Strings
- How to: Parse Strings Using String.Split
- How to: Search Strings Using String Methods
- How to: Search Strings Using Regular Expressions
- How to: Determine Whether a String Represents a Numeric Value
- How to: Convert a String to a DateTime
- How to: Convert Between Legacy Encodings and Unicode
- How to: Convert RTF to Plain Text
https://docs.microsoft.com/en-us/dotnet/articles/csharp/programming-guide/strings/index
Strings
A string is an object of type String whose value is text. Internally, the text is stored as a sequential read-only collection of Char objects. There is no null-terminating character at the end of a C# string; therefore a C# string can contain any number of embedded null characters ('\0'). The Length property of a string represents the number of Char objects it contains, not the number of Unicode characters. To access the individual Unicode code points in a string, use the StringInfo object.
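As a minimal sketch of the difference between the number of Char objects and the number of Unicode characters, the following compares the two for a character that is stored as a surrogate pair. The class name, the helper CountTextElements, and the sample character are illustrative assumptions, not part of the original guide.

```csharp
using System;
using System.Globalization;

class LengthVersusTextElements
{
    // Hypothetical helper: counts text elements rather than Char objects.
    public static int CountTextElements(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    static void Main()
    {
        // U+1D11E (musical symbol G clef) is stored as a surrogate pair.
        string clef = "\U0001D11E";
        Console.WriteLine(clef.Length);              // 2 Char objects
        Console.WriteLine(CountTextElements(clef));  // 1 text element
        // Read the full code point that starts at index 0.
        Console.WriteLine(char.ConvertToUtf32(clef, 0).ToString("X")); // 1D11E
    }
}
```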
string vs. System.String
In C#, the string keyword is an alias for String. Therefore, String and string are equivalent, and you can use whichever naming convention you prefer. The String class provides many methods for safely creating, manipulating, and comparing strings. In addition, the C# language overloads some operators to simplify common string operations. For more information about the keyword, see string. For more information about the type and its methods, see String.
Declaring and Initializing Strings
You can declare and initialize strings in various ways, as shown in the following example:
// Declare without initializing.
string message1;
// Initialize to null.
string message2 = null;
// Initialize as an empty string.
// Use the Empty constant instead of the literal "".
string message3 = System.String.Empty;
//Initialize with a regular string literal.
string oldPath = "c:\\Program Files\\Microsoft Visual Studio 8.0";
// Initialize with a verbatim string literal.
string newPath = @"c:\Program Files\Microsoft Visual Studio 9.0";
// Use System.String if you prefer.
System.String greeting = "Hello World!";
// In local variables (i.e. within a method body)
// you can use implicit typing.
var temp = "I'm still a strongly-typed System.String!";
// Use a const string to prevent 'message4' from
// being used to store another string value.
const string message4 = "You can't get rid of me!";
// Use the String constructor only when creating
// a string from a char*, char[], or sbyte*. See
// System.String documentation for details.
char[] letters = { 'A', 'B', 'C' };
string alphabet = new string(letters);
Note that you do not use the new operator to create a string object except when initializing the string with an array of chars.
Initialize a string with the Empty constant value to create a new String object whose string is of zero length. The string literal representation of a zero-length string is "". By initializing strings with the Empty value instead of null, you can reduce the chances of a NullReferenceException occurring. Use the static IsNullOrEmpty(String) method to verify the value of a string before you try to access it.
Immutability of String Objects
String objects are immutable: they cannot be changed after they have been created. All of the String methods and C# operators that appear to modify a string actually return the results in a new string object. In the following example, when the contents of s1 and s2 are concatenated to form a single string, the two original strings are unmodified. The += operator creates a new string that contains the combined contents. That new object is assigned to the variable s1, and the original object that was assigned to s1 is released for garbage collection because no other variable holds a reference to it.
string s1 = "A string is more ";
string s2 = "than the sum of its chars.";
// Concatenate s1 and s2. This actually creates a new
// string object and stores it in s1, releasing the
// reference to the original object.
s1 += s2;
System.Console.WriteLine(s1);
// Output: A string is more than the sum of its chars.
Because a string "modification" is actually a new string creation, you must use caution when you create references to strings. If you create a reference to a string, and then "modify" the original string, the reference will continue to point to the original object instead of the new object that was created when the string was modified. The following code illustrates this behavior:
string s1 = "Hello ";
string s2 = s1;
s1 += "World";
System.Console.WriteLine(s2);
//Output: Hello
For more information about how to create new strings that are based on modifications such as search and replace operations on the original string, see How to: Modify String Contents.
Regular and Verbatim String Literals
Use regular string literals when you must embed escape characters provided by C#, as shown in the following example:
string columns = "Column 1\tColumn 2\tColumn 3";
//Output: Column 1 Column 2 Column 3
string rows = "Row 1\r\nRow 2\r\nRow 3";
/* Output:
Row 1
Row 2
Row 3
*/
string title = "\"The \u00C6olean Harp\", by Samuel Taylor Coleridge";
//Output: "The Æolean Harp", by Samuel Taylor Coleridge
Use verbatim strings for convenience and better readability when the string text contains backslash characters, for example in file paths. Because verbatim strings preserve new line characters as part of the string text, they can be used to initialize multiline strings. Use double quotation marks to embed a quotation mark inside a verbatim string. The following example shows some common uses for verbatim strings:
string filePath = @"C:\Users\scoleridge\Documents\";
//Output: C:\Users\scoleridge\Documents\
string text = @"My pensive SARA ! thy soft cheek reclined
Thus on mine arm, most soothing sweet it is
To sit beside our Cot,...";
/* Output:
My pensive SARA ! thy soft cheek reclined
Thus on mine arm, most soothing sweet it is
To sit beside our Cot,...
*/
string quote = @"Her name was ""Sara.""";
//Output: Her name was "Sara."
String Escape Sequences
Escape sequence | Character name | Unicode encoding |
---|---|---|
\' | Single quote | 0x0027 |
\" | Double quote | 0x0022 |
\\ | Backslash | 0x005C |
\0 | Null | 0x0000 |
\a | Alert | 0x0007 |
\b | Backspace | 0x0008 |
\f | Form feed | 0x000C |
\n | New line | 0x000A |
\r | Carriage return | 0x000D |
\t | Horizontal tab | 0x0009 |
\U | Unicode escape sequence for surrogate pairs. | \Unnnnnnnn |
\u | Unicode escape sequence | \u0041 = "A" |
\v | Vertical tab | 0x000B |
\x | Unicode escape sequence similar to "\u" except with variable length. | \x0041 = "A" |
Note
At compile time, verbatim strings are converted to ordinary strings with all the same escape sequences. Therefore, if you view a verbatim string in the debugger watch window, you will see the escape characters that were added by the compiler, not the verbatim version from your source code. For example, the verbatim string @"C:\files.txt" will appear in the watch window as "C:\\files.txt".
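To make the note concrete, here is a small sketch showing that a verbatim literal and the equivalent regular literal produce the same string at run time. The class and helper names are hypothetical and chosen only for illustration.

```csharp
using System;

class VerbatimVersusRegular
{
    // Hypothetical helper: true when both literals denote the same text.
    public static bool SameText()
    {
        string verbatim = @"C:\files.txt";
        string regular = "C:\\files.txt";
        return string.Equals(verbatim, regular, StringComparison.Ordinal);
    }

    static void Main()
    {
        Console.WriteLine(SameText());              // True
        // The backslash is a single character in both forms.
        Console.WriteLine(@"C:\files.txt".Length);  // 12
    }
}
```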
Format Strings
A format string is a string whose contents can be determined dynamically at runtime. You create a format string by using the static Format method and embedding placeholders in braces that will be replaced by other values at runtime. The following example uses a format string to output the result of each iteration of a loop:
class FormatString
{
static void Main()
{
// Get user input.
System.Console.WriteLine("Enter a number");
string input = System.Console.ReadLine();
// Convert the input string to an int.
int j;
System.Int32.TryParse(input, out j);
// Write a different string each iteration.
string s;
for (int i = 0; i < 10; i++)
{
// A simple format string with no alignment formatting.
s = System.String.Format("{0} times {1} = {2}", i, j, (i * j));
System.Console.WriteLine(s);
}
//Keep the console window open in debug mode.
System.Console.ReadKey();
}
}
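The example above uses a format string with no alignment formatting. As a hedged sketch of what alignment adds, a placeholder can carry a minimum width after a comma, which right-aligns the value in a column. The class name AlignedFormatString and the helper Row are assumptions made for this illustration.

```csharp
using System;

class AlignedFormatString
{
    // Hypothetical helper: right-aligns each value in a fixed-width column.
    public static string Row(int i, int j)
    {
        return String.Format("{0,2} times {1,2} = {2,3}", i, j, i * j);
    }

    static void Main()
    {
        for (int i = 8; i <= 10; i++)
        {
            Console.WriteLine(Row(i, 7));
        }
    }
}
// Output:
//  8 times  7 =  56
//  9 times  7 =  63
// 10 times  7 =  70
```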
One overload of the Console.WriteLine method takes a format string as a parameter. Therefore, you can just embed a format string literal without an explicit call to the Format method. However, if you use the Trace.WriteLine method to display debug output in the Visual Studio Output window, you have to explicitly call the String.Format method because Trace.WriteLine only accepts a string, not a format string. For more information about format strings, see Formatting Types.
Substrings
A substring is any sequence of characters that is contained in a string. Use the Substring method to create a new string from a part of the original string. You can search for one or more occurrences of a substring by using the IndexOf method. Use the Replace method to replace all occurrences of a specified substring with a new string. Like the Substring method, Replace actually returns a new string and does not modify the original string. For more information, see How to: Search Strings Using String Methods and How to: Modify String Contents.
string s3 = "Visual C# Express";
System.Console.WriteLine(s3.Substring(7, 2));
// Output: "C#"
System.Console.WriteLine(s3.Replace("C#", "Basic"));
// Output: "Visual Basic Express"
// Index values are zero-based
int index = s3.IndexOf("C");
// index = 7
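IndexOf returns only the first match, so finding every occurrence means calling it repeatedly with a moving start index. The following is one possible sketch; the class FindAllOccurrences and the helper IndexesOf are hypothetical names, and ordinal matching is an assumption made here for predictability.

```csharp
using System;
using System.Collections.Generic;

class FindAllOccurrences
{
    // Hypothetical helper: returns the zero-based index of every match.
    public static List<int> IndexesOf(string text, string value)
    {
        if (String.IsNullOrEmpty(value))
        {
            throw new ArgumentException("value must not be null or empty");
        }
        List<int> indexes = new List<int>();
        int index = text.IndexOf(value, StringComparison.Ordinal);
        while (index >= 0)
        {
            indexes.Add(index);
            // Continue searching just past the start of the last match.
            index = text.IndexOf(value, index + 1, StringComparison.Ordinal);
        }
        return indexes;
    }

    static void Main()
    {
        string s = "Visual C# Express";
        foreach (int i in IndexesOf(s, "s"))
        {
            Console.WriteLine(i);
        }
        // Output: 2, 15 and 16, one per line
    }
}
```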
Accessing Individual Characters
You can use array notation with an index value to acquire read-only access to individual characters, as in the following example:
string s5 = "Printing backwards";
for (int i = 0; i < s5.Length; i++)
{
System.Console.Write(s5[s5.Length - i - 1]);
}
// Output: "sdrawkcab gnitnirP"
If the String methods do not provide the functionality that you must have to modify individual characters in a string, you can use a StringBuilder object to modify the individual chars "in-place", and then create a new string to store the results by using the StringBuilder methods. In the following example, assume that you must modify the original string in a particular way and then store the results for future use:
string question = "hOW DOES mICROSOFT wORD DEAL WITH THE cAPS lOCK KEY?";
System.Text.StringBuilder sb = new System.Text.StringBuilder(question);
for (int j = 0; j < sb.Length; j++)
{
if (System.Char.IsLower(sb[j]) == true)
sb[j] = System.Char.ToUpper(sb[j]);
else if (System.Char.IsUpper(sb[j]) == true)
sb[j] = System.Char.ToLower(sb[j]);
}
// Store the new string.
string corrected = sb.ToString();
System.Console.WriteLine(corrected);
// Output: How does Microsoft Word deal with the Caps Lock key?
Null Strings and Empty Strings
An empty string is an instance of a System.String object that contains zero characters. Empty strings are used often in various programming scenarios to represent a blank text field. You can call methods on empty strings because they are valid objects. Empty strings are initialized as follows:
string s = String.Empty;
By contrast, a null string does not refer to an instance of a System.String object and any attempt to call a method on a null string causes a NullReferenceException. However, you can use null strings in concatenation and comparison operations with other strings. The following examples illustrate some cases in which a reference to a null string does and does not cause an exception to be thrown:
static void Main()
{
string str = "hello";
string nullStr = null;
string emptyStr = String.Empty;
string tempStr = str + nullStr;
// Output of the following line: hello
Console.WriteLine(tempStr);
bool b = (emptyStr == nullStr);
// Output of the following line: False
Console.WriteLine(b);
// The following line creates a new empty string.
string newStr = emptyStr + nullStr;
// Null strings and empty strings behave differently. The following
// two lines display 0.
Console.WriteLine(emptyStr.Length);
Console.WriteLine(newStr.Length);
// The following line raises a NullReferenceException.
//Console.WriteLine(nullStr.Length);
// The null character can be displayed and counted, like other chars.
string s1 = "\x0" + "abc";
string s2 = "abc" + "\x0";
// Output of the following line: * abc*
Console.WriteLine("*" + s1 + "*");
// Output of the following line: *abc *
Console.WriteLine("*" + s2 + "*");
// Output of the following line: 4
Console.WriteLine(s2.Length);
}
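One way to avoid the NullReferenceException shown above is to test the string with the static String.IsNullOrEmpty method before accessing its members. The sketch below assumes that treating null as length zero is acceptable for the caller; the class and the helper SafeLength are hypothetical.

```csharp
using System;

class NullSafeLength
{
    // Hypothetical helper: returns 0 for null instead of throwing.
    public static int SafeLength(string s)
    {
        return String.IsNullOrEmpty(s) ? 0 : s.Length;
    }

    static void Main()
    {
        string nullStr = null;
        Console.WriteLine(SafeLength(nullStr));       // 0
        Console.WriteLine(SafeLength(String.Empty));  // 0
        Console.WriteLine(SafeLength("hello"));       // 5
    }
}
```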
Using StringBuilder for Fast String Creation
String operations in .NET are highly optimized and in most cases do not significantly impact performance. However, in some scenarios such as tight loops that are executing many hundreds or thousands of times, string operations can affect performance. The StringBuilder class creates a string buffer that offers better performance if your program performs many string manipulations. The StringBuilder object also enables you to reassign individual characters, something the built-in string data type does not support. This code, for example, changes the content of a string without creating a new string:
System.Text.StringBuilder sb = new System.Text.StringBuilder("Rat: the ideal pet");
sb[0] = 'C';
System.Console.WriteLine(sb.ToString());
System.Console.ReadLine();
//Outputs Cat: the ideal pet
In this example, a StringBuilder object is used to create a string from a set of numeric types:
class TestStringBuilder
{
static void Main()
{
System.Text.StringBuilder sb = new System.Text.StringBuilder();
// Create a string composed of numbers 0 - 9
for (int i = 0; i < 10; i++)
{
sb.Append(i.ToString());
}
System.Console.WriteLine(sb); // displays 0123456789
// Copy one character of the string (not possible with a System.String)
sb[0] = sb[9];
System.Console.WriteLine(sb); // displays 9123456789
}
}
Strings, Extension Methods and LINQ
Because the String type implements IEnumerable<T>, you can use the extension methods defined in the Enumerable class on strings. To avoid visual clutter, these methods are excluded from IntelliSense for the String type, but they are available nevertheless. You can also use LINQ query expressions on strings. For more information, see LINQ and Strings.
Related Topics
Topic | Description |
---|---|
How to: Modify String Contents | Provides a code example that illustrates how to modify the contents of strings. |
How to: Concatenate Multiple Strings | Illustrates how to use the + operator and the StringBuilder class to join strings together at compile time and run time. |
How to: Compare Strings | Shows how to perform ordinal comparisons of strings. |
How to: Parse Strings Using String.Split | Contains a code example that illustrates how to use the String.Split method to parse strings. |
How to: Search Strings Using String Methods | Explains how to use specific methods to search strings. |
How to: Search Strings Using Regular Expressions | Explains how to use regular expressions to search strings. |
How to: Determine Whether a String Represents a Numeric Value | Shows how to safely parse a string to see whether it has a valid numeric value. |
How to: Convert a String to a DateTime | Shows how to convert a string such as "01/24/2008" to a DateTime object. |
 | Provides links to topics that use String and StringBuilder methods to perform basic string operations. |
 | Describes how to insert characters or empty spaces into a string. |
 | Includes information about how to compare strings and provides examples in C# and Visual Basic. |
 | Describes how to create and modify dynamic string objects by using the StringBuilder class. |
 | Provides information about how to perform various string operations by using LINQ queries. |
 | Provides links to topics that explain programming constructs in C#. |
How to: Concatenate Multiple Strings
Concatenation is the process of appending one string to the end of another string. When you concatenate string literals or string constants by using the +
operator, the compiler creates a single string. No run time concatenation occurs. However, string variables can be concatenated only at run time. In this case, you should understand the performance implications of the various approaches.
Example
The following example shows how to split a long string literal into smaller strings in order to improve readability in the source code. These parts will be concatenated into a single string at compile time. There is no run time performance cost regardless of the number of strings involved.
static void Main()
{
// Concatenation of literals is performed at compile time, not run time.
string text = "Historically, the world of data and the world of objects " +
"have not been well integrated. Programmers work in C# or Visual Basic " +
"and also in SQL or XQuery. On the one side are concepts such as classes, " +
"objects, fields, inheritance, and .NET Framework APIs. On the other side " +
"are tables, columns, rows, nodes, and separate languages for dealing with " +
"them. Data types often require translation between the two worlds; there are " +
"different standard functions. Because the object world has no notion of query, a " +
"query can only be represented as a string without compile-time type checking or " +
"IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to " +
"objects in memory is often tedious and error-prone.";
Console.WriteLine(text);
}
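One way to convince yourself that literal concatenation happens at compile time is that the result can initialize a const, which only accepts compile-time constant expressions. This is a minimal sketch with hypothetical names (CompileTimeConcatenation, Joined), not code from the original guide.

```csharp
using System;

class CompileTimeConcatenation
{
    // This compiles only because "data " + "and " + "objects" is itself
    // a constant expression that the compiler folds into one literal.
    public const string Joined = "data " + "and " + "objects";

    static void Main()
    {
        Console.WriteLine(Joined);                        // data and objects
        Console.WriteLine(Joined == "data and objects");  // True
    }
}
```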
Example
To concatenate string variables, you can use the + or += operators, or the String.Concat, String.Format, or StringBuilder.Append methods. The + operator is easy to use and makes for intuitive code. Even if you use several + operators in one statement, the string content is copied only once. But if you repeat this operation multiple times, for example in a loop, it might cause efficiency problems. For example, note the following code:
static void Main(string[] args)
{
// To run this program, provide a command line string.
// In Visual Studio, see Project > Properties > Debug.
string userName = args[0];
string date = DateTime.Today.ToShortDateString();
// Use the + and += operators for one-time concatenations.
string str = "Hello " + userName + ". Today is " + date + ".";
System.Console.WriteLine(str);
str += " How are you today?";
System.Console.WriteLine(str);
// Keep the console window open in debug mode.
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
// Example output:
// Hello Alexander. Today is 1/22/2008.
// Hello Alexander. Today is 1/22/2008. How are you today?
// Press any key to exit.
//
Note
In string concatenation operations, the C# compiler treats a null string the same as an empty string, but it does not convert the value of the original null string.
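The behavior described in this note can be sketched as follows. The class NullInConcatenation and the helper Join3 are hypothetical names used only for illustration.

```csharp
using System;

class NullInConcatenation
{
    // Hypothetical helper: concatenates a possibly-null middle part.
    public static string Join3(string left, string middle, string right)
    {
        return left + middle + right;
    }

    static void Main()
    {
        string middle = null;
        // The null contributes nothing to the result.
        Console.WriteLine(Join3("a", middle, "b"));          // ab
        Console.WriteLine(String.Concat("a", middle, "b"));  // ab
        // The variable itself still holds null afterwards.
        Console.WriteLine(middle == null);                   // True
    }
}
```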
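If you are not concatenating large numbers of strings (for example, in a loop), the performance cost of this code is probably not significant. The same is true for the String.Concat and String.Format methods.
However, when performance is important, you should always use the StringBuilder class to concatenate strings. The following code uses the Append method of the StringBuilder class to concatenate strings without the chaining effect of the + operator.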
class StringBuilderTest
{
static void Main()
{
string text = null;
// Use StringBuilder for concatenation in tight loops.
System.Text.StringBuilder sb = new System.Text.StringBuilder();
for (int i = 0; i < 100; i++)
{
sb.AppendLine(i.ToString());
}
System.Console.WriteLine(sb.ToString());
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
// Output:
// 0
// 1
// 2
// 3
// 4
// ...
//
How to: Modify String Contents
Because strings are immutable, it is not possible (without using unsafe code) to modify the value of a string object after it has been created. However, there are many ways to modify the value of a string and store the result in a new string object. The String class provides methods that operate on an input string and return a new string object. In many cases, you can assign the new object to the variable that held the original string. The Regex class provides additional methods that work in a similar manner. The StringBuilder class provides a character buffer that you can modify "in-place." You call the StringBuilder.ToString method to create a new string object that contains the current contents of the buffer.
Example
The following example shows various ways to replace or remove substrings in a specified string.
class ReplaceSubstrings
{
string searchFor;
string replaceWith;
static void Main(string[] args)
{
ReplaceSubstrings app = new ReplaceSubstrings();
string s = "The mountains are behind the clouds today.";
// Replace one substring with another with String.Replace.
// Only exact matches are supported.
s = s.Replace("mountains", "peaks");
Console.WriteLine(s);
// Output: The peaks are behind the clouds today.
// Use Regex.Replace for more flexibility.
// Replace "the" or "The" with "many" or "Many".
// using System.Text.RegularExpressions
app.searchFor = "the"; // A very simple regular expression.
app.replaceWith = "many";
s = Regex.Replace(s, app.searchFor, app.ReplaceMatchCase, RegexOptions.IgnoreCase);
Console.WriteLine(s);
// Output: Many peaks are behind many clouds today.
// Replace all occurrences of one char with another.
s = s.Replace(' ', '_');
Console.WriteLine(s);
// Output: Many_peaks_are_behind_many_clouds_today.
// Remove a substring from the middle of the string.
string temp = "many_";
int i = s.IndexOf(temp);
if (i >= 0)
{
s = s.Remove(i, temp.Length);
}
Console.WriteLine(s);
// Output: Many_peaks_are_behind_clouds_today.
// Remove trailing and leading whitespace.
// See also the TrimStart and TrimEnd methods.
string s2 = " I'm wider than I need to be. ";
// Store the results in a new string variable.
temp = s2.Trim();
Console.WriteLine(temp);
// Output: I'm wider than I need to be.
// Keep the console window open in debug mode.
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
// Custom match method called by Regex.Replace
// using System.Text.RegularExpressions
string ReplaceMatchCase(Match m)
{
// Test whether the match is capitalized
if (Char.IsUpper(m.Value[0]) == true)
{
// Capitalize the replacement string
// using System.Text;
StringBuilder sb = new StringBuilder(replaceWith);
sb[0] = (Char.ToUpper(sb[0]));
return sb.ToString();
}
else
{
return replaceWith;
}
}
}
Example
To access the individual characters in a string by using array notation, you can use the StringBuilder object, which overloads the [] operator to provide access to its internal character buffer. You can also convert the string to an array of chars by using the ToCharArray method. The following example uses ToCharArray to create the array. Some elements of this array are then modified. A string constructor that takes a char array as an input parameter is then called to create a new string.
class ModifyStrings
{
static void Main()
{
string str = "The quick brown fox jumped over the fence";
System.Console.WriteLine(str);
char[] chars = str.ToCharArray();
int animalIndex = str.IndexOf("fox");
if (animalIndex != -1)
{
chars[animalIndex++] = 'c';
chars[animalIndex++] = 'a';
chars[animalIndex] = 't';
}
string str2 = new string(chars);
System.Console.WriteLine(str2);
// Keep the console window open in debug mode
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/* Output:
The quick brown fox jumped over the fence
The quick brown cat jumped over the fence
*/
Example
The following example is provided for those very rare situations in which you may want to modify a string in-place by using unsafe code in a manner similar to C-style char arrays. The example shows how to access the individual characters "in-place" by using the fixed keyword. It also demonstrates one possible side effect of unsafe operations on strings that results from the way that the C# compiler stores (interns) strings internally. In general, you should not use this technique unless it is absolutely necessary.
class UnsafeString
{
unsafe static void Main(string[] args)
{
// Compiler will store (intern)
// these strings in same location.
string s1 = "Hello";
string s2 = "Hello";
// Change one string using unsafe code.
fixed (char* p = s1)
{
p[0] = 'C';
}
// Both strings have changed.
Console.WriteLine(s1);
Console.WriteLine(s2);
// Keep console window open in debug mode.
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
}
How to: Compare Strings
When you compare strings, you are producing a result that says one string is greater than or less than the other, or that the two strings are equal. The rules by which the result is determined are different depending on whether you are performing ordinal comparison or culture-sensitive comparison. It is important to use the correct kind of comparison for the specific task.
Use basic ordinal comparisons when you have to compare or sort the values of two strings without regard to linguistic conventions. A basic ordinal comparison (System.StringComparison.Ordinal) is case-sensitive, which means that the two strings must match character for character: "and" does not equal "And" or "AND". A frequently-used variation is System.StringComparison.OrdinalIgnoreCase, which will match "and", "And", and "AND". StringComparison.OrdinalIgnoreCase is often used to compare file names, path names, network paths, and any other string whose value does not change based on the locale of the user's computer. For more information, see System.StringComparison.
Culture-sensitive comparisons are typically used to compare and sort strings that are input by end users, because the characters and sorting conventions of these strings might vary depending on the locale of the user's computer. Even strings that contain identical characters might sort differently depending on the culture of the current thread.
Note
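When you compare strings, you should use the methods that explicitly specify what kind of comparison you intend to perform. This makes your code much more maintainable and readable. Whenever possible, use the overloads of the methods of the String and Array classes that take a StringComparison enumeration parameter, so that you can specify which type of comparison to perform. It is best to avoid using the == and != operators when you compare strings. Also, avoid using the String.CompareTo instance methods because none of the overloads takes a StringComparison.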
Example
The following example shows how to correctly compare strings whose values will not change based on the locale of the user's computer. In addition, it also demonstrates the string interning feature of C#. When a program declares two or more identical string variables, the compiler stores them all in the same location. By calling the ReferenceEquals method, you can see that the two strings actually refer to the same object in memory. Use the String.Copy method to avoid interning, as shown in the example.
// Internal strings that will never be localized.
string root = @"C:\users";
string root2 = @"C:\Users";
// Use the overload of the Equals method that specifies a StringComparison.
// Ordinal is the fastest way to compare two strings.
bool result = root.Equals(root2, StringComparison.Ordinal);
Console.WriteLine("Ordinal comparison: {0} and {1} are {2}", root, root2,
result ? "equal." : "not equal.");
// To ignore case means "user" equals "User". This is the same as using
// String.ToUpperInvariant on each string and then performing an ordinal comparison.
result = root.Equals(root2, StringComparison.OrdinalIgnoreCase);
Console.WriteLine("Ordinal ignore case: {0} and {1} are {2}", root, root2,
result ? "equal." : "not equal.");
// A static method is also available.
bool areEqual = String.Equals(root, root2, StringComparison.Ordinal);
// String interning. Are these really two distinct objects?
string a = "The computer ate my source code.";
string b = "The computer ate my source code.";
// ReferenceEquals returns true if both objects
// point to the same location in memory.
if (String.ReferenceEquals(a, b))
Console.WriteLine("a and b are interned.");
else
Console.WriteLine("a and b are not interned.");
// Use String.Copy method to avoid interning.
string c = String.Copy(a);
if (String.ReferenceEquals(a, c))
Console.WriteLine("a and c are interned.");
else
Console.WriteLine("a and c are not interned.");
// Output:
// Ordinal comparison: C:\users and C:\Users are not equal.
// Ordinal ignore case: C:\users and C:\Users are equal.
// a and b are interned.
// a and c are not interned.
Example
The following example shows how to compare strings the preferred way by using the String methods that take a StringComparison enumeration. Note that the String.CompareTo instance methods are not used here, because none of the overloads takes a StringComparison.
// "They dance in the street."
// Linguistically (in Windows), "ss" is equal to
// the German eszett: 'ß' character in both en-US and de-DE cultures.
string first = "Sie tanzen in die Straße.";
string second = "Sie tanzen in die Strasse.";
Console.WriteLine("First sentence is {0}", first);
Console.WriteLine("Second sentence is {0}", second);
// Store CultureInfo for the current culture. Note that the original culture
// can be set and retrieved on the current thread object.
System.Threading.Thread thread = System.Threading.Thread.CurrentThread;
System.Globalization.CultureInfo originalCulture = thread.CurrentCulture;
// Set the culture to en-US.
thread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
// For culture-sensitive comparisons, use the String.Compare
// overload that takes a StringComparison value.
int i = String.Compare(first, second, StringComparison.CurrentCulture);
Console.WriteLine("Comparing in {0} returns {1}.", thread.CurrentCulture.Name, i);
// Change the current culture to Deutsch-Deutschland.
thread.CurrentCulture = new System.Globalization.CultureInfo("de-DE");
i = String.Compare(first, second, StringComparison.CurrentCulture);
Console.WriteLine("Comparing in {0} returns {1}.", thread.CurrentCulture.Name, i);
// For culture-sensitive string equality, use either String.Compare as above
// or the String.Equals overload that takes a StringComparison value.
thread.CurrentCulture = originalCulture;
bool b = String.Equals(first, second, StringComparison.CurrentCulture);
Console.WriteLine("The two strings {0} equal.", b == true ? "are" : "are not");
/*
* Output:
First sentence is Sie tanzen in die Straße.
Second sentence is Sie tanzen in die Strasse.
Comparing in en-US returns 0.
Comparing in de-DE returns 0.
The two strings are equal.
*/
Example
The following example shows how to sort and search for strings in an array in a culture-sensitive manner by using the static Array methods that take a System.StringComparer parameter.
class SortStringArrays
{
static void Main()
{
string[] lines = new string[]
{
@"c:\public\textfile.txt",
@"c:\public\textFile.TXT",
@"c:\public\Text.txt",
@"c:\public\testfile2.txt"
};
Console.WriteLine("Non-sorted order:");
foreach (string s in lines)
{
Console.WriteLine(" {0}", s);
}
Console.WriteLine("\r\nSorted order:");
// Specify Ordinal to demonstrate the different behavior.
Array.Sort(lines, StringComparer.Ordinal);
foreach (string s in lines)
{
Console.WriteLine(" {0}", s);
}
string searchString = @"c:\public\TEXTFILE.TXT";
Console.WriteLine("Binary search for {0}", searchString);
int result = Array.BinarySearch(lines, searchString, StringComparer.OrdinalIgnoreCase);
ShowWhere<string>(lines, result);
//Console.WriteLine("{0} {1}", result > 0 ? "Found" : "Did not find", searchString);
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
// Displays where the string was found, or, if not found,
// where it would have been located.
private static void ShowWhere<T>(T[] array, int index)
{
if (index < 0)
{
// If the index is negative, it represents the bitwise
// complement of the next larger element in the array.
index = ~index;
Console.Write("Not found. Sorts between: ");
if (index == 0)
Console.Write("beginning of array and ");
else
Console.Write("{0} and ", array[index - 1]);
if (index == array.Length)
Console.WriteLine("end of array.");
else
Console.WriteLine("{0}.", array[index]);
}
else
{
Console.WriteLine("Found at index {0}.", index);
}
}
}
/*
* Output:
Non-sorted order:
c:\public\textfile.txt
c:\public\textFile.TXT
c:\public\Text.txt
c:\public\testfile2.txt
Sorted order:
c:\public\Text.txt
c:\public\testfile2.txt
c:\public\textFile.TXT
c:\public\textfile.txt
Binary search for c:\public\TEXTFILE.TXT
Found at index 2.
*/
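Collection classes such as System.Collections.Hashtable, System.Collections.Generic.Dictionary<TKey, TValue>, and System.Collections.Generic.List<T> have constructors that take a System.StringComparer parameter when the type of the elements or keys is string. In general, you should use these constructors whenever possible, and specify either Ordinal or OrdinalIgnoreCase.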
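As an illustration of passing a comparer to a collection constructor, the following sketch builds a Dictionary whose string keys are compared with StringComparer.OrdinalIgnoreCase. The class name, the helper CreateSizes, and the sample paths and values are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

class CaseInsensitiveKeys
{
    // Hypothetical helper: builds a dictionary whose keys ignore case.
    public static Dictionary<string, int> CreateSizes()
    {
        Dictionary<string, int> sizes =
            new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        sizes[@"c:\public\textfile.txt"] = 120;
        // Same key under OrdinalIgnoreCase, so this overwrites the entry.
        sizes[@"C:\PUBLIC\TEXTFILE.TXT"] = 256;
        return sizes;
    }

    static void Main()
    {
        Dictionary<string, int> sizes = CreateSizes();
        Console.WriteLine(sizes.Count);                       // 1
        Console.WriteLine(sizes[@"c:\Public\TextFile.txt"]);  // 256
    }
}
```

Without the comparer argument, the dictionary would use the default case-sensitive equality and keep two separate entries.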
How to: Parse Strings Using String.Split
The following code example demonstrates how a string can be parsed using the String.Split method. As input, Split takes an array of characters that indicate which characters separate interesting substrings of the target string. The function returns an array of the substrings.
This example uses spaces, commas, periods, colons, and tabs, all passed in an array containing these separating characters to Split. Each word in the target string's sentence displays separately from the resulting array of strings.
Example
class TestStringSplit
{
static void Main()
{
char[] delimiterChars = { ' ', ',', '.', ':', '\t' };
string text = "one\ttwo three:four,five six seven";
System.Console.WriteLine("Original text: '{0}'", text);
string[] words = text.Split(delimiterChars);
System.Console.WriteLine("{0} words in text:", words.Length);
foreach (string s in words)
{
System.Console.WriteLine(s);
}
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/* Output:
Original text: 'one two three:four,five six seven'
7 words in text:
one
two
three
four
five
six
seven
*/
Example
By default, String.Split returns empty strings when two separating characters appear contiguously in the target string. You can pass an optional StringSplitOptions.RemoveEmptyEntries parameter to exclude any empty strings in the output.
String.Split can also take an array of strings, so that character sequences rather than single characters act as separators for parsing the target string.
class TestStringSplit
{
static void Main()
{
string[] separatingChars = { "<<", "..." };
string text = "one<<two......three<four";
System.Console.WriteLine("Original text: '{0}'", text);
string[] words = text.Split(separatingChars, System.StringSplitOptions.RemoveEmptyEntries );
System.Console.WriteLine("{0} substrings in text:", words.Length);
foreach (string s in words)
{
System.Console.WriteLine(s);
}
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/* Output:
Original text: 'one<<two......three<four'
3 substrings in text:
one
two
three<four
*/
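To make the effect of StringSplitOptions.RemoveEmptyEntries concrete, here is a small sketch that splits the same text with and without the option. The class name, the SplitWords helper, and the sample text are illustrative assumptions, not part of the original samples.

```csharp
using System;

class SplitEmptyEntries
{
    // Hypothetical helper: splits on commas and spaces, optionally dropping empty entries.
    public static string[] SplitWords(string text, bool removeEmpty)
    {
        char[] delimiters = { ',', ' ' };
        StringSplitOptions options = removeEmpty
            ? StringSplitOptions.RemoveEmptyEntries
            : StringSplitOptions.None;
        return text.Split(delimiters, options);
    }

    static void Main()
    {
        string text = "one, two,,three";
        // ", " and ",," each contain two adjacent separators, producing an empty entry each.
        Console.WriteLine(SplitWords(text, false).Length); // 5
        Console.WriteLine(SplitWords(text, true).Length);  // 3
    }
}
```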
How to: Search Strings Using String Methods
The string type, which is an alias for the System.String class, provides a number of useful methods for searching the contents of a string.
Example
The following example uses the IndexOf, LastIndexOf, StartsWith, and EndsWith methods to search the strings.
class StringSearch
{
static void Main()
{
string str = "Extension methods have all the capabilities of regular static methods.";
// Write the string and include the quotation marks.
System.Console.WriteLine("\"{0}\"", str);
// Simple comparisons are always case sensitive!
bool test1 = str.StartsWith("extension");
System.Console.WriteLine("Starts with \"extension\"? {0}", test1);
// For user input and strings that will be displayed to the end user,
// use the StringComparison parameter on methods that have it to specify how to match strings.
bool test2 = str.StartsWith("extension", System.StringComparison.CurrentCultureIgnoreCase);
System.Console.WriteLine("Starts with \"extension\"? {0} (ignoring case)", test2);
bool test3 = str.EndsWith(".", System.StringComparison.CurrentCultureIgnoreCase);
System.Console.WriteLine("Ends with '.'? {0}", test3);
// This search returns the substring between two strings, so
// the first index is moved to the character just after the first string.
int first = str.IndexOf("methods") + "methods".Length;
int last = str.LastIndexOf("methods");
string str2 = str.Substring(first, last - first);
System.Console.WriteLine("Substring between \"methods\" and \"methods\": '{0}'", str2);
// Keep the console window open in debug mode
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/*
Output:
"Extension methods have all the capabilities of regular static methods."
Starts with "extension"? False
Starts with "extension"? True (ignoring case)
Ends with '.'? True
Substring between "methods" and "methods": ' have all the capabilities of regular static '
Press any key to exit.
*/
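The StringComparison parameter mentioned in the comments above is also available on IndexOf. The sketch below uses StringComparison.OrdinalIgnoreCase, which is an assumption suited to non-linguistic matching; for text shown to end users, a culture-sensitive value may be the better choice. The class and the Find helper are hypothetical names.

```csharp
using System;

class CaseInsensitiveSearch
{
    // Hypothetical helper: returns the index of the first match, ignoring case, or -1.
    public static int Find(string source, string value)
    {
        return source.IndexOf(value, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        string str = "Extension methods have all the capabilities of regular static methods.";
        Console.WriteLine(Find(str, "METHODS")); // 10
        Console.WriteLine(Find(str, "lambda"));  // -1
    }
}
```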
How to: Search Strings Using Regular Expressions
The System.Text.RegularExpressions.Regex class can be used to search strings. These searches can range in complexity from very simple to making full use of regular expressions. The following are two examples of string searching by using the Regex class.
Example
The following code is a console application that performs a simple case-insensitive search of the strings in an array. The static method Regex.IsMatch performs the search given the string to search and a string that contains the search pattern. In this case, a third argument, RegexOptions.IgnoreCase, indicates that case should be ignored.
class TestRegularExpressions
{
static void Main()
{
string[] sentences =
{
"C# code",
"Chapter 2: Writing Code",
"Unicode",
"no match here"
};
string sPattern = "code";
foreach (string s in sentences)
{
System.Console.Write("{0,24}", s);
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase))
{
System.Console.WriteLine(" (match for '{0}' found)", sPattern);
}
else
{
System.Console.WriteLine();
}
}
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/* Output:
C# code (match for 'code' found)
Chapter 2: Writing Code (match for 'code' found)
Unicode (match for 'code' found)
no match here
*/
Example
The following code is a console application that uses regular expressions to validate the format of each string in an array. The validation requires that each string take the form of a telephone number in which three groups of digits are separated by dashes, the first two groups contain three digits, and the third group contains four digits. This is done by using the regular expression ^\d{3}-\d{3}-\d{4}$, which is written as "^\\d{3}-\\d{3}-\\d{4}$" in a regular C# string literal because each backslash must be escaped.
class TestRegularExpressionValidation
{
static void Main()
{
string[] numbers =
{
"123-555-0190",
"444-234-22450",
"690-555-0178",
"146-893-232",
"146-555-0122",
"4007-555-0111",
"407-555-0111",
"407-2-5555",
};
string sPattern = "^\\d{3}-\\d{3}-\\d{4}$";
foreach (string s in numbers)
{
System.Console.Write("{0,14}", s);
if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern))
{
System.Console.WriteLine(" - valid");
}
else
{
System.Console.WriteLine(" - invalid");
}
}
// Keep the console window open in debug mode.
System.Console.WriteLine("Press any key to exit.");
System.Console.ReadKey();
}
}
/* Output:
123-555-0190 - valid
444-234-22450 - invalid
690-555-0178 - valid
146-893-232 - invalid
146-555-0122 - valid
4007-555-0111 - invalid
407-555-0111 - valid
407-2-5555 - invalid
*/
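The same pattern can be written without doubled backslashes by using a verbatim string literal. The following sketch, with a hypothetical class name and IsValid helper, also stores the pattern in a Regex instance so that it can be reused across calls; that is one possible arrangement, not the only one.

```csharp
using System;
using System.Text.RegularExpressions;

class PhoneNumberCheck
{
    // The verbatim literal @"..." holds the same pattern as "^\\d{3}-\\d{3}-\\d{4}$".
    private static readonly Regex PhonePattern = new Regex(@"^\d{3}-\d{3}-\d{4}$");

    // Hypothetical helper: true when the candidate has the form ddd-ddd-dddd.
    public static bool IsValid(string candidate)
    {
        return PhonePattern.IsMatch(candidate);
    }

    static void Main()
    {
        Console.WriteLine(IsValid("123-555-0190"));  // True
        Console.WriteLine(IsValid("4007-555-0111")); // False
        // Both literals produce the same string.
        Console.WriteLine(@"^\d{3}-\d{3}-\d{4}$" == "^\\d{3}-\\d{3}-\\d{4}$"); // True
    }
}
```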
How to: Determine Whether a String Represents a Numeric Value
To determine whether a string is a valid representation of a specified numeric type, use the static TryParse method that is implemented by all primitive numeric types and also by types such as DateTime and IPAddress. The following example shows how to determine whether "108" is a valid int.
int i = 0;
string s = "108";
bool result = int.TryParse(s, out i); //i now = 108
If the string contains nonnumeric characters or the numeric value is too large or too small for the particular type you have specified, TryParse returns false and sets the out parameter to zero. Otherwise, it returns true and sets the out parameter to the numeric value of the string.
Note
A string may contain only numeric characters and still not be valid for the type whose TryParse method you use. For example, "256" is not a valid value for byte but it is valid for int. "98.6" is not a valid value for int but it is a valid decimal.
Example
The following examples show how to use TryParse with string representations of long, byte, and decimal values.
string numString = "1287543"; //"1287543.0" will return false for a long
long number1 = 0;
bool canConvert = long.TryParse(numString, out number1);
if (canConvert == true)
Console.WriteLine("number1 now = {0}", number1);
else
Console.WriteLine("numString is not a valid long");
byte number2 = 0;
numString = "255"; // A value of 256 will return false
canConvert = byte.TryParse(numString, out number2);
if (canConvert == true)
Console.WriteLine("number2 now = {0}", number2);
else
Console.WriteLine("numString is not a valid byte");
decimal number3 = 0;
numString = "27.3"; //"27" is also a valid decimal
canConvert = decimal.TryParse(numString, out number3);
if (canConvert == true)
Console.WriteLine("number3 now = {0}", number3);
else
Console.WriteLine("numString is not a valid decimal");
Robust Programming
Primitive numeric types also implement the Parse static method, which throws an exception if the string is not a valid number. TryParse is generally more efficient because it just returns false if the number is not valid.
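The difference between the two methods can be sketched as follows. The class name and both helper methods are hypothetical; the sketch only assumes that int.Parse throws a FormatException for text that is not a number, while int.TryParse reports the failure through its return value.

```csharp
using System;

class ParseVersusTryParse
{
    // Hypothetical helper: returns the parsed value, or a fallback when parsing fails.
    public static int ParseOrDefault(string s, int fallback)
    {
        int value;
        return int.TryParse(s, out value) ? value : fallback;
    }

    // Hypothetical helper: reports whether int.Parse throws FormatException for the input.
    public static bool ParseThrowsFormatException(string s)
    {
        try
        {
            int.Parse(s);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("108", -1));          // 108
        Console.WriteLine(ParseOrDefault("10x8", -1));         // -1
        Console.WriteLine(ParseThrowsFormatException("10x8")); // True
    }
}
```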
.NET Framework Security
Always use the TryParse or Parse methods to validate user input from controls such as text boxes and combo boxes.
How to: Convert a String to a DateTime
It is common for programs to enable users to enter dates as string values. To convert a string-based date to a System.DateTime object, you can use the Convert.ToDateTime(String) method or the DateTime.Parse(String) static method, as shown in the following example.
Culture. Different cultures in the world write date strings in different ways. For example, in the US, 01/20/2008 is January 20th, 2008. In France, parsing this string throws a FormatException, because France reads dates as Day/Month/Year, whereas the US uses Month/Day/Year.
Consequently, a string like 20/01/2008 parses to January 20th, 2008 in France, but throws a FormatException in the US.
To determine your current culture settings, you can use System.Globalization.CultureInfo.CurrentCulture.
The following example shows a simple conversion of a string to a DateTime.
string dateTime = "01/08/2008 14:50:50.42";
DateTime dt = Convert.ToDateTime(dateTime);
Console.WriteLine("Year: {0}, Month: {1}, Day: {2}, Hour: {3}, Minute: {4}, Second: {5}, Millisecond: {6}",
dt.Year, dt.Month, dt.Day, dt.Hour, dt.Minute, dt.Second, dt.Millisecond);
// Specify exactly how to interpret the string.
IFormatProvider culture = new System.Globalization.CultureInfo("fr-FR", true);
// Alternate choice: If the string has been input by an end user, you might
// want to format it according to the current culture:
// IFormatProvider culture = System.Threading.Thread.CurrentThread.CurrentCulture;
DateTime dt2 = DateTime.Parse(dateTime, culture, System.Globalization.DateTimeStyles.AssumeLocal);
Console.WriteLine("Year: {0}, Month: {1}, Day: {2}, Hour: {3}, Minute: {4}, Second: {5}, Millisecond: {6}",
dt2.Year, dt2.Month, dt2.Day, dt2.Hour, dt2.Minute, dt2.Second, dt2.Millisecond);
/* Output (assuming first culture is en-US and second is fr-FR):
Year: 2008, Month: 1, Day: 8, Hour: 14, Minute: 50, Second: 50, Millisecond: 420
Year: 2008, Month: 8, Day: 1, Hour: 14, Minute: 50, Second: 50, Millisecond: 420
Press any key to continue . . .
*/
Example
// Date strings are interpreted according to the current culture.
// If the culture is en-US, this is interpreted as "January 8, 2008",
// but if the user's computer is fr-FR, this is interpreted as "August 1, 2008"
string date = "01/08/2008";
DateTime dt = Convert.ToDateTime(date);
Console.WriteLine("Year: {0}, Month: {1}, Day: {2}", dt.Year, dt.Month, dt.Day);
// Specify exactly how to interpret the string.
IFormatProvider culture = new System.Globalization.CultureInfo("fr-FR", true);
// Alternate choice: If the string has been input by an end user, you might
// want to format it according to the current culture:
// IFormatProvider culture = System.Threading.Thread.CurrentThread.CurrentCulture;
DateTime dt2 = DateTime.Parse(date, culture, System.Globalization.DateTimeStyles.AssumeLocal);
Console.WriteLine("Year: {0}, Month: {1}, Day: {2}", dt2.Year, dt2.Month, dt2.Day);
/* Output (assuming first culture is en-US and second is fr-FR):
Year: 2008, Month: 1, Day: 8
Year: 2008, Month: 8, Day: 1
*/
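When the expected layout of the date is known in advance, one way to avoid depending on culture settings is to name the format explicitly. The sketch below swaps in DateTime.TryParseExact with CultureInfo.InvariantCulture, which the samples above do not use; the class name and the TryParseWithFormat helper are hypothetical.

```csharp
using System;
using System.Globalization;

class ExplicitDateFormat
{
    // Hypothetical helper: parses text only if it matches the given format exactly.
    public static bool TryParseWithFormat(string text, string format, out DateTime result)
    {
        return DateTime.TryParseExact(
            text, format, CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }

    static void Main()
    {
        DateTime parsed;
        // Read as month/day/year, "20" is not a valid month, so parsing fails.
        Console.WriteLine(TryParseWithFormat("20/01/2008", "MM/dd/yyyy", out parsed)); // False
        // Read as day/month/year, the same text is January 20, 2008.
        Console.WriteLine(TryParseWithFormat("20/01/2008", "dd/MM/yyyy", out parsed)); // True
        Console.WriteLine("{0}-{1}-{2}", parsed.Year, parsed.Month, parsed.Day);       // 2008-1-20
    }
}
```

Because TryParseExact returns false instead of throwing, it also fits the advice given earlier for validating user input.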
How to: Convert Between Legacy Encodings and Unicode
In C#, all strings in memory are encoded as Unicode (UTF-16). When you bring data from storage into a string object, the data is automatically converted to UTF-16. If the data contains only ASCII values from 0 through 127, the conversion requires no extra effort on your part. However, if the source text contains extended ASCII byte values (128 through 255), the extended characters will be interpreted by default according to the current code page. To specify that the source text should be interpreted according to a different code page, use the System.Text.Encoding class as shown in the following example.
Example
The following example shows how to convert a text file that has been encoded in 8-bit ASCII, interpreting the source text according to Windows Code Page 737.
using System;
using System.Text;
class ANSIToUnicode
{
static void Main()
{
// Create a file that contains the Greek word ψυχή (psyche) when interpreted by using
// code page 737 ((DOS) Greek). You can also create the file by using Character Map
// to paste the characters into Microsoft Word and then "Save As" by using the DOS
// (Greek) encoding. (Word will actually create a six-byte file by appending "\r\n" at the end.)
System.IO.File.WriteAllBytes(@"greek.txt", new byte[] { 0xAF, 0xAC, 0xAE, 0x9E });
// Specify the code page to correctly interpret byte values
Encoding encoding = Encoding.GetEncoding(737); //(DOS) Greek code page
byte[] codePageValues = System.IO.File.ReadAllBytes(@"greek.txt");
// Same content is now encoded as UTF-16
string unicodeValues = encoding.GetString(codePageValues);
// Show that the text content is still intact in Unicode string
// (Add a reference to System.Windows.Forms.dll)
System.Windows.Forms.MessageBox.Show(unicodeValues);
// Same content "ψυχή" is stored as UTF-8
System.IO.File.WriteAllText(@"greek_unicode.txt", unicodeValues);
// Conversion is complete. Show the bytes to prove the conversion.
Console.WriteLine("8-bit encoding byte values:");
foreach(byte b in codePageValues)
Console.Write("{0:X}-", b);
Console.WriteLine();
Console.WriteLine("Unicode values:");
string unicodeString = System.IO.File.ReadAllText("greek_unicode.txt");
System.Globalization.TextElementEnumerator enumerator =
System.Globalization.StringInfo.GetTextElementEnumerator(unicodeString);
while(enumerator.MoveNext())
{
string s = enumerator.GetTextElement();
int i = Char.ConvertToUtf32(s, 0);
Console.Write("{0:X}-", i);
}
Console.WriteLine();
// Keep the console window open in debug mode.
Console.Write("Press any key to exit.");
Console.ReadKey();
}
/*
* Output:
8-bit encoding byte values:
AF-AC-AE-9E
Unicode values:
3C8-3C5-3C7-3B7
*/
}
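The same decode-then-encode steps can be sketched without files or a message box. This version assumes ISO-8859-1 (Latin-1) as the legacy encoding instead of code page 737, because that encoding is available by name without extra setup; the class name, the Latin1ToUtf8 helper, and the sample bytes are illustrative.

```csharp
using System;
using System.Text;

class LegacyBytesToUnicode
{
    // Hypothetical helper: decodes legacy single-byte text, then re-encodes it as UTF-8.
    public static byte[] Latin1ToUtf8(byte[] legacyBytes)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string text = latin1.GetString(legacyBytes); // held as UTF-16 in memory
        return Encoding.UTF8.GetBytes(text);
    }

    static void Main()
    {
        // 0xE9 is 'é' in ISO-8859-1, so these four bytes spell "café".
        byte[] legacy = { 0x63, 0x61, 0x66, 0xE9 };
        byte[] utf8 = Latin1ToUtf8(legacy);
        Console.WriteLine(BitConverter.ToString(legacy)); // 63-61-66-E9
        Console.WriteLine(BitConverter.ToString(utf8));   // 63-61-66-C3-A9
    }
}
```

The extended character takes one byte in the legacy encoding and two bytes in UTF-8, which is why the byte counts differ even though the text is the same.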
How to: Convert RTF to Plain Text
Rich Text Format (RTF) is a document format developed by Microsoft in the late 1980s to enable the exchange of documents across operating systems. Both Microsoft Word and WordPad can read and write RTF documents. In the .NET Framework, you can use the RichTextBox control to create a word processor that supports RTF and enables a user to apply formatting to text in a WYSIWYG manner.
You can also use the RichTextBox control to programmatically remove the RTF formatting codes from a document and convert it to plain text. You do not need to embed the control in a Windows Form to perform this kind of operation.
To use the RichTextBox control in a project
Add a reference to System.Windows.Forms.dll.
Add a using directive for the System.Windows.Forms namespace (optional).
Example
The following example converts a sample RTF file to plain text. The file contains RTF formatting (such as font information), four Unicode characters, and four extended ASCII characters. The example code opens the file, passes its content to a RichTextBox as RTF, retrieves the content as text, displays the text in a MessageBox, and outputs the text to a file in UTF-8 format.
The MessageBox and the output file contain the following text:
The Greek word for "psyche" is spelled ψυχή. The Greek letters are encoded in Unicode.
These characters are from the extended ASCII character set (Windows code page 1252): âäӑå
// Use NotePad to save the following RTF code to a text file in the same folder as
// your .exe file for this project. Name the file test.rtf.
/*
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fswiss\fcharset0 Arial;}
{\f1\fnil\fprq1\fcharset0 Courier New;}{\f2\fswiss\fprq2\fcharset0 Arial;}}
{\colortbl ;\red0\green128\blue0;\red0\green0\blue0;}
{\*\generator Msftedit 5.41.21.2508;}
\viewkind4\uc1\pard\f0\fs20 The \i Greek \i0 word for "psyche" is spelled \cf1\f1\u968?\u965?\u967?\u942?\cf2\f2 . The Greek letters are encoded in Unicode.\par
These characters are from the extended \b ASCII \b0 character set (Windows code page 1252): \'e2\'e4\u1233?\'e5\cf0\par }
*/
class ConvertFromRTF
{
static void Main()
{
// If your RTF file isn't in the same folder as the .exe file for the project,
// specify the path to the file in the following assignment statement.
string path = @"test.rtf";
//Create the RichTextBox. (Requires a reference to System.Windows.Forms.)
System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox();
// Get the contents of the RTF file. When the contents of the file are
// stored in the string (rtfText), the contents are encoded as UTF-16.
string rtfText = System.IO.File.ReadAllText(path);
// Display the RTF text. This should look like the contents of your file.
System.Windows.Forms.MessageBox.Show(rtfText);
// Use the RichTextBox to convert the RTF code to plain text.
rtBox.Rtf = rtfText;
string plainText = rtBox.Text;
// Display the plain text in a MessageBox because the console can't
// display the Greek letters. You should see the following result:
// The Greek word for "psyche" is spelled ψυχή. The Greek letters are
// encoded in Unicode.
// These characters are from the extended ASCII character set (Windows
// code page 1252): âäӑå
System.Windows.Forms.MessageBox.Show(plainText);
// Output the plain text to a file, encoded as UTF-8.
System.IO.File.WriteAllText(@"output.txt", plainText);
}
}
RTF characters are encoded in eight bits. However, users can specify Unicode characters in addition to extended ASCII characters from specified code pages. Because the RichTextBox.Text property is of type string, the characters are encoded as Unicode UTF-16. Any extended ASCII characters and Unicode characters from the source RTF document are correctly encoded in the text output.
If you use the File.WriteAllText method to write the text to disk, the text will be encoded as UTF-8 (without a Byte Order Mark).