Onox: Imho, the preallocation should be made optional in the script [..] I don't think that preallocation is a good idea on HDDs that use SMR [..]
Furthermore, I was told not to defragment HDDs with drive-managed SMR because since the SMR aspect is then hidden to the operating system, it cannot properly defragment the drive
phaolo: Options are always good, but preallocation just reserves space, it doesn't perform any additional write.
Disabling it will just mean that you'll have to defragment the drive at some point, and that will be actually worse for SMR.
The OS doesn't worry about the physical structure of the disk if there's a correct driver to manage it.
Well, at least two of us in this discussion have experienced very slow preallocation, because on NTFS it does have to write out the entire space that the files will take up. That’s why timppu had to comment it out.

And what you say about SMR conflicts with what I had been told. I don’t know then. But it makes sense to me that if the drive completely hides the fact that it uses SMR, the OS can’t have a correct understanding of where and how the files are actually stored on the disk. Do operating systems really have specific drivers for every model of HDD? (I have never had to install any driver for that)
Onox: Well, we are at least two people in this discussion to have experienced very slow preallocation because on NTFS it does have to write the entire space that will be taken up by the files. That’s why timppu had to comment it out
I may be a bit of a special case, because I mainly use gogrepoc.py on a Linux machine (mostly my Raspberry Pi 4), where I download the games to an external NTFS USB partition. It seems the ntfs-3g driver in Linux has some issue with that preallocation; maybe it just times out while waiting for the preallocation to finish.

I don't have an issue if I use gogrepoc.py preallocation either in Windows (to the same external NTFS USB HDD) or when downloading the games to an ext4 partition in Linux. It is just the combination of Linux + NTFS that has some issue with preallocation, at least for me...
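
One way to check whether preallocation itself is the slow step on a given mount is to time it on a scratch file. The sketch below is only an illustration: `time_preallocation` is a hypothetical helper, not part of gogrepoc.py, and it assumes a Unix system with Python 3.3+ where `os.posix_fallocate` is available. For background, the `posix_fallocate(3)` man page notes that glibc emulates the call by writing into every block when the file system has no native support, which might be what makes it slow on some mounts. Whether that applies to a particular ntfs-3g setup is an assumption to verify.

```python
import os
import tempfile
import time


def time_preallocation(directory, size_bytes):
    """Return the seconds os.posix_fallocate needs for a scratch file in `directory`.

    Hypothetical diagnostic helper; the scratch file is removed afterwards.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="prealloc_test_")
    try:
        start = time.monotonic()
        # Reserves size_bytes starting at offset 0, or raises OSError.
        os.posix_fallocate(fd, 0, size_bytes)
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Replace the directory with the mount you actually download to.
    seconds = time_preallocation(tempfile.gettempdir(), 16 * 1024 * 1024)
    print("preallocating 16 MiB took %.3f s" % seconds)
```

Running it once against an ext4 directory and once against the NTFS mount should show whether the two behave differently.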

I'll check if I can post the lines I commented out.

By the way, good information on SMR; I wasn't aware of e.g. the defragmentation issue. I learned about SMR a couple of years ago when I was looking for new big archival HDDs, and it sounded like exactly the kind of thing I want to avoid, because I quite often copy lots of very big files across my archival hard drives, sometimes even several terabytes in one swoop. After all, the best way to defragment your data is to copy it over from one partition to an empty one. :)

SMR sounds suitable mainly for static archives where you add a little bit of data every now and then to it, and that's it. For big file operations, SMR sounds like a very bad idea.

I think the HDD manufacturers have tried to hide whether their HDDs use SMR, but now they have to be more open about it, as there was the scandal about at least Seagate, and possibly some other HDD manufacturer, using SMR even on hard drives that they claimed to be suitable for NAS use. People had serious problems trying to use those "NAS HDDs with SMR" in e.g. RAID setups.

I guess that would kinda explain why USB HDDs tend to be cheaper than internal HDDs, if the former use SMR more often...
Post edited January 19, 2021 by timppu
Thanks, I was also wondering about this! It makes sense that it works normally on Windows and with other filesystems. Like you, I download on Linux to an external NTFS HDD (usually I download to memory first and then copy the files over, instead of downloading directly to the external drive). Maybe I should start downloading on Windows instead, but it will be less convenient for me. I think I’m going to continue downloading manually for some time at least.
Onox: .... I'd like to know, would it be possible for you to post your script file somewhere so that I can see what you commented out exactly? Thanks
I disabled preallocation in my copy of Kalanyr's master script as I was downloading to a network connected NAS configured as a RAID. I commented out all lines from 1619 to 1644 AND lines 1651 to 1706 inclusive. Here's what I commented out:

# if file_sz < sz: #preallocate extra space
#     if platform.system() == "Windows":
#         try:
#             info("increasing preallocation to '%d' bytes for '%s' " % (sz,downloading_path))
#             preH = ctypes.windll.kernel32.CreateFileW(compat_downloading_path, GENERIC_READ | GENERIC_WRITE, 0, None, OPEN_EXISTING, 0, None)
#             if preH==-1:
#                 warn("could not get filehandle")
#                 raise OSError()
#             c_sz = ctypes.wintypes.LARGE_INTEGER(sz)
#             ctypes.windll.kernel32.SetFilePointerEx(preH,c_sz,None,FILE_BEGIN)
#             ctypes.windll.kernel32.SetEndOfFile(preH)
#             ctypes.windll.kernel32.CloseHandle(preH)
#         except:
#             log_exception('')
#             warn("preallocation failed")
#             if preH != -1:
#                 info('failed - closing outstanding handle')
#                 ctypes.windll.kernel32.CloseHandle(preH)
#     else:
#         if sys.version_info[0] >= 4 or (sys.version_info[0] == 3 and sys.version_info[1] >= 3):
#             info("increasing preallocation to '%d' bytes for '%s' using posix_fallocate " % (sz,downloading_path))
#             with open(downloading_path, "r+b") as f:
#                 try:
#                     os.posix_fallocate(f.fileno(),0,sz)
#                 except:
#                     warn("posix preallocation failed")

AND

#     if file_sz < sz: #preallocate extra space
#         if platform.system() == "Windows":
#             try:
#                 preH = -1
#                 info("increasing preallocation to '%d' bytes for '%s' " % (sz,downloading_path))
#                 preH = ctypes.windll.kernel32.CreateFileW(compat_downloading_path, GENERIC_READ | GENERIC_WRITE, 0, None, OPEN_EXISTING, 0, None)
#                 if preH==-1:
#                     warn("could not get filehandle")
#                     raise OSError()
#                 c_sz = ctypes.wintypes.LARGE_INTEGER(sz)
#                 ctypes.windll.kernel32.SetFilePointerEx(preH,c_sz,None,FILE_BEGIN)
#                 ctypes.windll.kernel32.SetEndOfFile(preH)
#                 ctypes.windll.kernel32.CloseHandle(preH)
#             except:
#                 log_exception('')
#                 warn("preallocation failed")
#                 if preH != -1:
#                     info('failed - closing outstanding handle')
#                     ctypes.windll.kernel32.CloseHandle(preH)
#         else:
#             if sys.version_info[0] >= 4 or (sys.version_info[0] == 3 and sys.version_info[1] >= 3):
#                 info("increasing preallocation to '%d' bytes for '%s' using posix_fallocate " % (sz,downloading_path))
#                 with open(downloading_path, "r+b") as f:
#                     try:
#                         os.posix_fallocate(f.fileno(),0,sz)
#                     except:
#                         warn("posix preallocation failed")
# else:
#     if platform.system() == "Windows":
#         try:
#             preH = -1
#             info("preallocating '%d' bytes for '%s' " % (sz,downloading_path))
#             preH = ctypes.windll.kernel32.CreateFileW(compat_downloading_path, GENERIC_READ | GENERIC_WRITE, 0, None, CREATE_NEW, 0, None)
#             if preH==-1:
#                 warn("could not get filehandle")
#                 raise OSError()
#             c_sz = ctypes.wintypes.LARGE_INTEGER(sz)
#             ctypes.windll.kernel32.SetFilePointerEx(preH,c_sz,None,FILE_BEGIN)
#             ctypes.windll.kernel32.SetEndOfFile(preH)
#             ctypes.windll.kernel32.CloseHandle(preH)
#             #DEVNULL = open(os.devnull, 'wb')
#             #subprocess.call(["fsutil","file","createnew",path,str(sz)],stdout=DEVNULL,stderr=DEVNULL)
#         except:
#             log_exception('')
#             warn("preallocation failed")
#             if preH != -1:
#                 info('failed - closing outstanding handle')
#                 ctypes.windll.kernel32.CloseHandle(preH)
#     else:
#         if sys.version_info[0] >= 4 or (sys.version_info[0] == 3 and sys.version_info[1] >= 3):
#             info("attempting preallocating '%d' bytes for '%s' using posix_fallocate " % (sz,downloading_path))
#             with open(downloading_path, "wb") as f:
#                 try:
#                     os.posix_fallocate(f.fileno(),0,sz)
#                 except:
#                     warn("posix preallocation failed")
Post edited January 19, 2021 by ikrananka
Thank you! I'd love to install a NAS at home, but it seems a bit difficult in the current circumstances (and expensive), and I need to convince my parents.

Thanks everyone for your comments and help <3
Onox: very slow preallocation because on NTFS it does have to write the entire space that will be taken up by the files.
This feels very weird to me. If it's true, I wonder why.

Onox: Do operating systems really have specific drivers for every model of HDD?
Yes, though nowadays they probably get downloaded and installed automatically by the system (plus, generic drivers exist).
As an example of this, I recall that the old Win7 couldn't recognize newer SSDs, unless you manually slipstreamed\loaded their drivers before the OS setup.

Onox: if the drive completely hides the fact that it uses SMR, the OS can’t have a correct understanding of where and how the files are actually stored on the disk.
With normal file operations the OS doesn't need to care about the physical layout.
For example it just tells the driver\controller to create a file on the HDD. The controller will then take care of deciding where to physically place it and report the success\failure of the operation.
Post edited January 19, 2021 by phaolo
Onox: Thanks, I was also wondering about this ! It makes sense that it works normally on Windows and with other filesystems. Like you, I download on Linux to an external NTFS HDD (usually, I download in memory and then I copy the files instead of directly downloading on the external drive). Maybe I should start downloading on Windows instead, but it will be less convenient for me. I think I’m going to continue downloading manually for some time at least
Ok so you have the same setup as me then. I was going to warn that I think I commented out only the preallocation lines relevant to Linux (and left the Windows-related preallocation lines untouched), but since you are also downloading in Linux, I guess this works for you too.

First of all, I am using the development version, gogrepoc.py, not sure how different the master gogrepo.py is.

So in gogrepoc.py, I commented out these lines, in three different parts of the script (which are close to each other) under the "downloader worker thread main loop":

## else:
##     if sys.version_info[0] >= 4 or (sys.version_info[0] == 3 and sys.version_info[1] >= 3):
##         info("increasing preallocation to '%d' bytes for '%s' using posix_fallocate " % (sz,downloading_path))
##         with open(downloading_path, "r+b") as f:
##             try:
##                 os.posix_fallocate(f.fileno(),0,sz)
##             except Exception:
##                 warn("posix preallocation failed")
Post edited January 19, 2021 by timppu
Hi,

I'll be in a similar situation soon: I want to get the library locally on a Linux box. From Python, you can call the external program ntfsfallocate from the ntfs-3g package; see the manual at manpages.ubuntu.com/manpages/cosmic/man8/ntfsfallocate.8.html. I don't know of any Python packages that wrap the ntfs-3g tools. Also, I haven't found any problems with this tool. Does that help?
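
A minimal sketch of calling the tool from Python might look like the following. The function names are hypothetical, and the argument order is an assumption based on the man page synopsis (`ntfsfallocate [options] -l length device file`), so please check it against the manual on your own system first. Note that the synopsis takes a device rather than a mounted path, so the tool may be intended for volumes that are not mounted.

```python
import shutil
import subprocess


def build_ntfsfallocate_cmd(device, path_in_volume, length_bytes):
    """Build the argument list for ntfsfallocate.

    ASSUMPTION: follows the synopsis 'ntfsfallocate [options] -l length device file'.
    """
    return ["ntfsfallocate", "-l", str(length_bytes), device, path_in_volume]


def ntfs_preallocate(device, path_in_volume, length_bytes):
    """Run ntfsfallocate if it is installed; return True only on exit code 0."""
    if shutil.which("ntfsfallocate") is None:
        return False  # tool not installed, nothing was done
    cmd = build_ntfsfallocate_cmd(device, path_in_volume, length_bytes)
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

The device name and file path passed in (for example `/dev/sdX1` and a path inside the volume) are placeholders to be replaced with real values.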
NTFS write support on Linux is relatively new so yeah it's not particularly surprising that there might be some issues (particularly with pre-allocation which is unnecessary or impossible on most Linux file systems).

I'm going to change the pre-allocation to detect the file system type on the target drive at some point and only pre-allocate if it's a drive type where that's necessary (NTFS or a FAT variant) and make it optional at the same time.
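
On Linux, the detection step could be sketched roughly as below. This is not the script's actual implementation: `fstype_for_path` and `PREALLOC_FSTYPES` are hypothetical names, and the sketch assumes the mount table is read from /proc/mounts. One caveat: ntfs-3g mounts are typically listed there as `fuseblk` rather than `ntfs`, so that name is included, at the cost of also matching other FUSE block file systems.

```python
import os

# File system names that would trigger preallocation in this sketch.
# "fuseblk" is included because ntfs-3g mounts usually appear under that name.
PREALLOC_FSTYPES = {"ntfs", "ntfs3", "fuseblk", "vfat", "msdos", "exfat"}


def fstype_for_path(path, mounts_text):
    """Return the fs type of the longest mount point containing `path`.

    `mounts_text` is the content of a mount table in /proc/mounts format
    (device, mount point, fs type, ...). Mount points containing spaces are
    escaped as \\040 in that file, which this sketch does not decode.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same mount point win.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fstype
    return best_type


def wants_preallocation(path, mounts_text):
    """True if the target path lives on one of the listed file system types."""
    return fstype_for_path(path, mounts_text) in PREALLOC_FSTYPES
```

In real use the second argument would come from something like `open("/proc/mounts").read()`.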

However, be very aware that if you're using an NTFS/FAT drive on Windows with pre-allocation turned off, the nature of threading plus relatively frequent updates that change file sizes (for the first/last file in archive sets) mean that the drive will get heavily fragmented quickly, as far as such things go.

As far as I can tell, SMR disks would be pretty poorly suited to maintaining an up-to-date archive of GOG installers in any case. The main activity is writing (keeping the archive up to date), not reading (I suspect most people use Galaxy for day-to-day management). Large amounts of data are likely to be written over relatively short periods of time (when you do an update), and the nature of threading makes those writes semi-random. The pre-allocation might make that worse, but you're looking at a doubling at worst.

SMR drives seem intended for replacing tape backups, where the unspoken use case is Write Once Read Never (i.e. you commit data to it and then never change that data and hopefully never even look at it again, because if you do it means either an audit or major data loss), though SMR drives seem more suitable for Write Rarely Read Occasionally than tape.
Kalanyr: NTFS write support on Linux is relatively new so yeah it's not particularly surprising that there might be some issues (particularly with pre-allocation which is unnecessary or impossible on most Linux file systems).
Most Linux filesystems would be ext4 (maybe some would use xfs, but if you just let the installer go without telling it what to do, you will end up with ext4, at least for Ubuntu 18.04 and before).

So yeah, if you go low level enough to do ntfs specific stuff on a filesystem that is not ntfs, I'm guessing nothing good would come out of it.

Kalanyr: I'm going to change the pre-allocation to detect the file system type on the target drive at some point and only pre-allocate if it's a drive type where that's necessary (NTFS or a FAT variant) and make it optional at the same time.

However be very aware that if you're using a NTFS/FAT drive on windows the nature of threading, relatively frequent updates that change file sizes (for the first / last file in archive sets) means that the drive you use will get heavily fragmented quickly as far as such things go.
Something to keep in mind, however, is that people increasingly use SSDs (maybe not to store terabytes of data at home yet, but that day may come in the not so distant future), and SSDs don't suffer heavy performance penalties from fragmentation the way HDDs do, because they don't incur seek time penalties from reading non-contiguous segments.

Heck, for those with a game collection under 1TB, they could affordably back their collection up on an SSD right now.

You might want to structure that part of your code in such a way that you can rip it out in a couple of years.
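
One way to keep that part easy to remove is to route every preallocation through a single function with an on/off switch, so callers never touch the platform details. The sketch below is only an illustration under that assumption: `preallocate` and its `enabled` flag are hypothetical names, and only the POSIX path is shown.

```python
import os


def preallocate(path, size_bytes, enabled=True):
    """Try to reserve `size_bytes` for `path`; return True only if it worked.

    When `enabled` is False, or os.posix_fallocate is unavailable, nothing
    is touched and the download code simply writes the file as usual.
    """
    if not enabled or not hasattr(os, "posix_fallocate"):
        return False
    try:
        # O_CREAT creates the file if needed; existing content is kept.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    except OSError:
        return False
    try:
        os.posix_fallocate(fd, 0, size_bytes)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

With this shape, switching the default to off later, or deleting the feature entirely, touches one function instead of several places in the download loop.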
Post edited January 28, 2021 by Magnitus
Hmm, you make a good point there wrt SSDs. My GOG collection is ~7 TB at this point, so SSDs aren't really a cost-effective method for me (I do have a 7 TB external SSD, but I'd much prefer to use it for Steam games that can benefit from it rather than for archives), but that will be a simple matter of changing the option to default to off.

<End Reply>
<Begin General>
On that note, with Linux potentially having some flaws in its NTFS drivers with pre-allocation, I'd like to know whether pre-allocation on exFAT/NTFS drives is an issue across multiple Linux distros or specific to the one timppu is using. If it's a generic problem, I'll also disable file pre-allocation when a *nix OS is detected and it's writing to exFAT and/or NTFS, depending on whether one or both of them have this problem generically.
Kalanyr: On that note with Linux potentially having some flaws in it's NTFS drivers with Pre-allocation, I'd like to know if the pre-allocation on a exFAT/NTFS drives is an issue across multiple Linux distros or specific to the one timppu is using, if it's a generic problem, I'll also disable file allocation if a *nix OS is detected if it's writing to exFAT and/or NTFS depending if one or both of them have this problem generically.
I don't have much empirical experience with it, but my understanding is that it's ntfs that causes runaway fragmentation and not Windows (Windows has fragmentation problems because it uses ntfs).

So I think it's safe to only pre-allocate if the filesystem is ntfs (if you want to go to that level of detail, though the more "hands on" you are at a lower level, the more compatibility problems you'll encounter across versions/platforms).

I don't think most Linux users would want to use ntfs for serious backups (they have access to filesystems that are simply better and also better supported), so I think trying to troubleshoot ntfs on Linux is very much a rare edge case.

If it is simpler (I think it would be), you could simply default to: pre-allocate with Windows and never pre-allocate with Linux (and if you stumble upon that user who wants to write to an ntfs filesystem on Linux, then they'd have to deal with the fragmentation themselves, possibly using some defrag tool).
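
That simple default could be sketched as follows. The function name and the `user_choice` override are hypothetical, not something the script currently exposes.

```python
import platform


def should_preallocate(user_choice=None, system=None):
    """Decide whether to pre-allocate.

    user_choice: True/False forces the behaviour, None means "use the default".
    system: result of platform.system(); passed in only to make testing easy.
    """
    if user_choice is not None:
        return bool(user_choice)
    if system is None:
        system = platform.system()
    # Default policy from the post above: Windows yes, everything else no.
    return system == "Windows"
```

A user writing to ntfs from Linux could still opt in explicitly with `should_preallocate(user_choice=True)`.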
Post edited January 28, 2021 by Magnitus
You're missing the use case of a shared drive for someone who uses Windows/Linux/Mac. exFAT (which is the standard for a shared Linux/Mac/Windows external drive) also fragments (which makes sense, the fragmentation of NTFS was inherited from FAT32). Those things mean you can't just use the OS as a shortcut (though that's what I'm doing now to a first approximation).
Kalanyr: You're missing the use case of a shared drive for someone who uses Windows/Linux/Mac. exFAT (which is the standard for a shared Linux/Mac/Windows external drive) also fragments (which makes sense, the fragmentation of NTFS was inherited from FAT32). Those things mean you can't just use the OS as a shortcut (though that's what I'm doing now to a first approximation).
That's a fair point, although I'd say that someone doing backups on a shared Windows filesystem drive is probably better off doing their backup with Windows and not Linux.

If they are doing it on Linux with a Windows filesystem, it essentially means that they are doing an important (at least from an emotional standpoint) backup using a less dependable OS/filesystem coupling.

Also, I guess it's different things for different people, but for me, a backup is not a source from which I would fetch my games (I'd still use GOG.com for that), except maybe to verify the integrity with a checksum. A backup is something to have on hand in case GOG goes offline. Correctness > Convenience of Access, imo.
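
For the integrity check mentioned above, a checksum comparison could be sketched like this. The helper names are hypothetical, and the default algorithm is an arbitrary choice; use whichever one your reference checksums were produced with.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large installers need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches the expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Reading in fixed-size chunks keeps memory use flat regardless of how big the backed-up file is.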

Anyways, food for thought. It's your show. Personally, I would not go too deep into troubleshooting Windows filesystems on Linux. I'd try to make it as correct as possible (as in, the files are correctly stored) from a high level and let users defrag their drives themselves if they really want to go there.
Post edited January 28, 2021 by Magnitus
Kalanyr: On that note with Linux potentially having some flaws in it's NTFS drivers with Pre-allocation, I'd like to know if the pre-allocation on a exFAT/NTFS drives is an issue across multiple Linux distros or specific to the one timppu is using, if it's a generic problem, I'll also disable file allocation if a *nix OS is detected if it's writing to exFAT and/or NTFS depending if one or both of them have this problem generically.
I started using your fork rather than the original recently, and I had to comment out all the preallocation stuff; otherwise your script ended up hanging.

Fedora Linux, NTFS mount.