On Tue, 31 Mar 2009 12:07:01 -0700, nickdu <nicknospa...@community.nospam> wrote:
> Is there a way to enumerate the files in a directory? The only method I see to get the files in a directory is Directory.GetFiles(). I don't want to get a list of all files in the directory but instead enumerate the files in a directory. The reason is that there are hundreds of thousands of files in the directory I'm processing and the Directory.GetFiles() method is taking quite a bit of time to build the list. Instead I would rather have the unmanaged functionality of FindFirst()/FindNext().
> Do I need to go through interop to get this functionality?
Pretty much, yes. You could "divide and conquer" the Directory.GetFiles() approach by careful crafting of search patterns to use, so that each call to GetFiles() didn't retrieve so many files at once. But it's probably easier to just use the unmanaged API, if that's really the behavior you want.
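The "divide and conquer" idea could be sketched roughly as follows. This is only an illustration under assumptions: the helper name GetFilesByPrefix, the prefix set, and the C:\Data path are hypothetical, and the sketch assumes every file name starts with one of the listed characters (anything else is silently skipped). Each GetFiles() call still builds an array, just a smaller one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkedListing
{
    // Hypothetical helper: issues one Directory.GetFiles() call per leading
    // character in 'prefixes' and yields the results batch by batch.
    public static IEnumerable<string> GetFilesByPrefix(string directory, string prefixes)
    {
        foreach (char c in prefixes)
        {
            // Only names matching "<c>*" are collected by this call, so the
            // array built here is a fraction of the full listing.
            foreach (string path in Directory.GetFiles(directory, c + "*"))
            {
                yield return path;
            }
        }
    }
}

public static class ChunkedListingDemo
{
    public static void Run()
    {
        // Assumed prefix set: digits and lowercase letters.
        const string prefixes = "0123456789abcdefghijklmnopqrstuvwxyz";
        foreach (string path in ChunkedListing.GetFilesByPrefix(@"C:\Data", prefixes))
        {
            Console.WriteLine(path);
        }
    }
}
```

On Windows the pattern match is case-insensitive, so lowercase prefixes also cover uppercase names. GetFiles() also matches against 8.3 short names, so in rare cases a file may be returned under more than one prefix; treat the prefix set as something to tune for the actual naming scheme in the directory.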
On a side note, this looks like something worthy of a feature request on MS Connect. Who knows, it might get into .NET 5.0 that way :)
Hi Nick,

Actually, the Directory.GetFiles() method itself calls the Win32 FindFirstFile and FindNextFile functions to build the resulting string array.

However, this is not always what we want: nobody wants a thread blocked for 10 seconds building a huge string array when all that is needed is to process the files one by one. I understand the pain, so I have written a DirectoryEnumerator class to solve the problem.

The basic idea is to implement the IEnumerable<string> interface in the DirectoryEnumerator class. It provides an IEnumerator<string>, so a foreach loop can retrieve the file names one at a time. Usage looks something like this:
foreach (string file in new DirectoryEnumerator(@"C:\Windows\*.log", Mode.File))
{
    // process the file
}
The enumerator finds the next file only when the MoveNext method of the IEnumerator interface is called. The drawback of this implementation is that access is forward-only: there is no going back and no access by index. If you need random or by-index access, you can simply go back to the GetFiles method.

Here is my proof-of-concept implementation of the DirectoryEnumerator class. You can improve on it to meet your requirements.
using System;
using System.IO;
using System.Text;
using System.Collections;
using System.Collections.Generic;
using System.Security.Permissions;
using System.Runtime.InteropServices;
using System.Runtime.ConstrainedExecution;
using Microsoft.Win32.SafeHandles;
using System.ComponentModel;

public class DirectoryEnumerator : IEnumerable<string>
{
    #region The Enumerator

    public struct Enumerator : IEnumerator<string>
    {
        #region Private members

...

    protected override bool ReleaseHandle()
    {
        // Close the search handle.
        return Win32Native.FindClose(base.handle);
    }
}

internal static class Win32Native
{
    [Serializable, StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto), BestFitMapping(false)]
    internal class WIN32_FIND_DATA
    {
        internal int dwFileAttributes;
        internal int ftCreationTime_dwLowDateTime;
        internal int ftCreationTime_dwHighDateTime;
        internal int ftLastAccessTime_dwLowDateTime;
        internal int ftLastAccessTime_dwHighDateTime;
        internal int ftLastWriteTime_dwLowDateTime;
        internal int ftLastWriteTime_dwHighDateTime;
        internal int nFileSizeHigh;
        internal int nFileSizeLow;
        internal int dwReserved0;
        internal int dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        internal string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        internal string cAlternateFileName;
    }

    internal const int ERROR_NO_MORE_FILES = 18;
    internal const int ERROR_FILE_NOT_FOUND = 2;
    internal const int FILE_ATTRIBUTE_DIRECTORY = 0x00000010;
}
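Since the listing above is incomplete as archived, here is a separate, minimal sketch of the same lazy-enumeration idea. It is not the DirectoryEnumerator class from this post: the LazyDirectory name, the Enumerate signature, and the use of a C# iterator (yield return) with a raw IntPtr handle are assumptions made to keep the example short. It is Windows-only and mirrors the WIN32_FIND_DATA layout and constants shown above.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class LazyDirectory
{
    private const int ERROR_FILE_NOT_FOUND = 2;
    private const int ERROR_NO_MORE_FILES = 18;
    private const int FILE_ATTRIBUTE_DIRECTORY = 0x10;
    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public int dwFileAttributes;
        public int ftCreationTimeLow, ftCreationTimeHigh;
        public int ftLastAccessTimeLow, ftLastAccessTimeHigh;
        public int ftLastWriteTimeLow, ftLastWriteTimeHigh;
        public int nFileSizeHigh, nFileSizeLow;
        public int dwReserved0, dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    // Yields entry names (not full paths) matching 'pattern', one per MoveNext call.
    // directories == true returns only directories, false returns only files.
    public static IEnumerable<string> Enumerate(string pattern, bool directories)
    {
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(pattern, out data);
        if (handle == INVALID_HANDLE_VALUE)
        {
            int error = Marshal.GetLastWin32Error();
            if (error == ERROR_FILE_NOT_FOUND) yield break; // nothing matched
            throw new Win32Exception(error);
        }

        try
        {
            do
            {
                bool isDirectory = (data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
                bool isDotEntry = data.cFileName == "." || data.cFileName == "..";
                if (isDirectory == directories && !isDotEntry)
                {
                    yield return data.cFileName;
                }
            }
            while (FindNextFile(handle, out data));

            int last = Marshal.GetLastWin32Error();
            if (last != ERROR_NO_MORE_FILES) throw new Win32Exception(last);
        }
        finally
        {
            // Runs when the loop completes or the enumerator is disposed early.
            FindClose(handle);
        }
    }
}
```

With this sketch, foreach (string name in LazyDirectory.Enumerate(@"C:\Windows\*.log", false)) processes one file per iteration, and passing true instead returns the directories directly under the searched folder (it does not recurse). Directories are selected by the FILE_ATTRIBUTE_DIRECTORY bit rather than by a search pattern. The original code wraps the search handle in a SafeHandle-derived class whose ReleaseHandle calls FindClose, which is more robust than the raw IntPtr used here if an enumerator is abandoned without being disposed. On .NET Framework 4 and later, the built-in Directory.EnumerateFiles method provides lazy enumeration without interop.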
If you have any further questions regarding this issue, please feel free to post here.
Regards,
Jie Wang (jie...@online.microsoft.com, remove 'online.')
Microsoft Online Community Support
Delighting our customers is our #1 priority. We welcome your comments and suggestions about how we can improve the support we provide to you. Please feel free to let my manager know what you think of the level of service provided. You can send feedback directly to my manager at: msd...@microsoft.com.
Note: MSDN Managed Newsgroup support offering is for non-urgent issues where an initial response from the community or a Microsoft Support Engineer within 2 business days is acceptable. Please note that each follow up response may take approximately 2 business days as the support professional working with you may need further investigation to reach the most efficient resolution. The offering is not appropriate for situations that require urgent, real-time or phone-based interactions. Issues of this nature are best handled working with a dedicated Microsoft Support Engineer by contacting Microsoft Customer Support Services (CSS) at http://msdn.microsoft.com/en-us/subscriptions/aa948874.aspx ================================================== This posting is provided "AS IS" with no warranties, and confers no rights.
Yes, this is essential for processing a large number of files within a directory.
Hope the code sample helps.
Thanks,
Jie Wang (jie...@online.microsoft.com, remove 'online.')
Microsoft Online Community Support
How would I go about using your class if I just wanted a count of all folders on a drive?

Is the pattern for a directory *.dir?
jiewa wrote (01-Apr-09):
> Hi Nick,
> Actually the Directory.GetFiles() method calls the Win32 FindFirstFile & FindNextFile functions to generate the result string array.
> [rest of quoted reply snipped]
Previous Posts In This Thread:
On 31 March 2009 15:07
nicknospamd wrote:
Enumerating a directory, FindFirst()/FindNext()? Is there a way to enumerate the files in a directory? The only method I see to get the files in a directory is Directory.GetFiles(). I don't want to get a list of all files in the directory but instead enumerate the files in a directory. The reason is that there are hundreds of thousands of files in the directory I'm processing and the Directory.GetFiles() method is taking quite a bit of time to build the list. Instead I would rather have
...