标签云

微信群

扫码加入我们

WeChat QR Code

How do I copy a file in Python?I couldn't find anything under os.


It seems that cp is not a system call and therefore does not belong to the os module.It is a shell command, so it is put in the shutil module.

2019年05月24日16分25秒

What is the difference between copy and copyfile?

2019年05月24日16分25秒

in copy(src, dst) the dst can be a directory.

2019年05月24日16分25秒

Note that not all metadata will be copied, depending on your platform.

2019年05月24日16分25秒

MonaJalal It's case sensitive, don't use capital F.

2019年05月24日16分25秒

Note that it is not an atomic operation. Take care using it in a threaded application.

2019年05月24日16分25秒

Could also add that copy and copy2 accept directory as destination contrary to copyfile which require to specify the complete name.

2019年05月23日16分25秒

jezrael Extracted from the doc: dst must be the complete target file name; look at shutil.copy() for a copy that accepts a target directory path. (here: docs.python.org/3/library/shutil.html#shutil.copyfile)

2019年05月24日16分25秒

Now this is just bad, bad API design.

2019年05月23日16分25秒

What does the "Dest dir OK" column indicate? Does it mean it doesn't check if the destination directory is OK before copying?

2019年05月24日16分25秒

KatieS "Dest dir OK" indicates whether it's okay to specify a directory as the destination, as opposed to a file. From the docs on shutil.copy(src, dst, *, follow_symlinks=True) (also applies to copy2): "If dst specifies a directory, the file will be copied into dst using the base filename from src."

2019年05月23日16分25秒

Although the documentation warns that copy2 does not preserve all metadata, this is just what I needed as I wanted the items you list.Thanks!

2019年05月23日16分25秒

Your answer is a bit deceptive. Your choice of words for shutil.copy2('/dir/file.ext', '/new/dir') suggests that copy2 will create a new directory. But in this case dir must already exist as a directory or else file.ext will be copied to a new file called dir.

2019年05月24日16分25秒

I am trying to randomly copy 100k files from 1 million files. copyfile is considerably faster than copy2

2019年05月23日16分25秒

am I correct to assume that shutil.copy2('/dir/file.ext', '/new/dir/') (with slash after the target path) will remove the ambiguity over whether to copy to a new file called "dir" or to put the file into a directory of that name?

2019年05月23日16分25秒

Vijay I believe this overhead is due to copying the metadata.

2019年05月24日16分25秒

I noticed a while ago that the module is called shutil (singular) and not shutils (plural), and indeed it is in Python 2.3. Nevertheless I leave this function here as an example.

2019年05月24日16分25秒

Copying a file's contents is a straightforward operation. Copying the file with its metadata is anything but straightforward, even more so if you want to be cross-platform.

2019年05月24日16分25秒

True. Looking at the shutil docs, the copyfile function also won't copy metadata.

2019年05月24日16分25秒

BTW, if you want to copy the contents between two file-like objects, there's shutil.copyfileobj(fsrc, fdst)

2019年05月23日16分25秒

Yes, I'm not sure why you wouldn't just copy the source of shutil.copyfileobj.Also, you don't have any try, finally to handle closing the files after exceptions.I would say however, that your function shouldn't be responsible for opening and closing the files at all.That should go in a wrapper function, like how shutil.copyfile wraps shutil.copyfileobj.

2019年05月23日16分25秒

Just curious, how did you generate that table?

2019年05月24日16分25秒

lightalchemist I just used vim as a scratchpad, copied the used unicode symbols from a wikipedia table and copied the result into the stackoverflow editor for final polishing.

2019年05月23日16分25秒

Using single-string commands is bad coding style (flexibility, reliability and security), instead use ['copy', sourcefile, destfile] syntax wherever possible, especially if the parameters are from user input.

2019年05月24日16分25秒

Why do you list so many bad alternatives to the shutil copy functions?

2019年05月23日16分25秒

shutil is built-in, no need to provide non-portable alternatives. The answer could be actually improved by removing the system dependent solutions, and after that removal, this answer is just a copy of the existing answers / a copy of the documentation.

2019年05月24日16分25秒

os.popen is deprecated for a while now. and check_output doesn't return the status but the output (which is empty in the case of copy/cp)

2019年05月24日16分25秒

Jean-FrançoisFabre thanks for taking your time to review! I will get back to this in three days (because of a deadline).

2019年05月24日16分25秒

this is not portable, and unnecessary since you can just use shutil.

2019年05月24日16分25秒

Even when shutil is not available - subprocess.run()(without shell=True!) is the better alternative to os.system().

2019年05月24日16分25秒

shutil is more portable

2019年05月24日16分25秒

subprocess.run() as suggested by maxschlepzig is a big step forward, when calling external programs. For flexibility and security however, use the ['cp', rawfile, 'rawdata.dat'] form of passing the command line. (However, for copying, shutil and friends are recommended over calling an external program.)

2019年05月24日16分25秒

try that with filenames with spaces in it.

2019年05月23日16分25秒

This reads the complete source file into memory before writing it back. Thus, this unnecessarily wastes memory for all but the smallest file copy operations.

2019年05月24日16分25秒

Is that true? I think .read() and .write() are buffered by default (at least for CPython).

2019年05月24日16分25秒

soundstripe, Of course this is true. The fact that the file object returned by open() does buffered IO, by default doesn't help you here, because read() is specified as: 'If n is negative or omitted, read until EOF.' That means that the read() returns the complete file content as a string.

2019年05月24日16分25秒

maxschlepzig I get your point and I admit I wasn't aware of it. The reason I provided this answer was in case someone wanted to do a simple file copy using only built-ins, without needing to import a module for it. Of course memory optimization should not be a concern if you want this option. Anyways thank you for clearing that out. I updated the answer accordingly.

2019年05月24日16分25秒

this seems a little redundant since the writer should handle buffering. for l in open('file.txt','r'): output.write(l) should work find; just setup the output stream buffer to your needs. or you can go by the bytes by looping over a try with output.write(read(n)); output.flush() where n is the number of bytes you'd like to write at a time.both of these also don't have an condition to check which is a bonus.

2019年05月24日16分25秒

Yes, but I thought that maybe this could be easier to understand because it copies entire lines rather than parts of them (in case we don't know how many bytes each line is).

2019年05月23日16分25秒

Very true. Coding for teachingand coding for efficiency are very different.

2019年05月24日16分25秒

owns To add to this question a year later, writelines() has shown slightly better performance over write() since we don't waste time consistently opening a new filestream, and instead write new lines as one large bytefeed.

2019年05月24日16分25秒

looking at the source - writelines calls write, hg.python.org/cpython/file/c6880edaf6f3/Modules/_io/bytesio.c.Also, the file stream is already open, so write wouldn't need to reopen it every time.

2019年05月23日16分25秒

This depends on the platform, so i would not use is.

2019年05月24日16分25秒

Such a call is unsecure. Please refere to the subproces docu about it.

2019年05月23日16分25秒

this is not portable, and unnecessary since you can just use shutil.

2019年05月24日16分25秒

Hmm why Python, then?

2019年05月23日16分25秒

Maybe detect the operating system before starting (whether it's DOS or Unix, because those are the two most used)

2019年05月24日16分25秒

The idea is nice and the code is beautiful, but a proper copy() function can do more things, such as copying attributes (+x bit), or for example deleting the already-copied bytes in case a disk-full condition is found.

2019年05月23日16分25秒

All answers need explanation, even if it is one sentence. No explanation sets bad precedent and is not helpful in understanding the program. What if a complete Python noob came along and saw this, wanted to use it, but couldn't because they don't understand it? You want to be helpful to all in your answers.

2019年05月23日16分25秒

Isn't that missing the .close() on all of those open(...)s?

2019年05月23日16分25秒

No need of .close(), as we are NOT STORING the file pointer object anywhere(neither for the src file nor for the destination file).

2019年05月24日16分25秒

AFAIK, it is undefined when the files are actually closed, SundeepBorra. Using with (as in the example above) is recommended and not more complicated. Using read() on a raw file reads the entire file into memory, which may be too big. Use a standard function like from shutil so that you and whoever else is involved in the code does not need to worry about special cases. docs.python.org/3/library/io.html#io.BufferedReader

2019年05月23日16分25秒

And then someone uses the code (accidentally or purposefully) on a large file… Using functions from shutil handles all the special cases for you and gives you peace of mind.

2019年05月23日16分25秒

at least it doesn't repeat the same solutions over and over again.

2019年05月23日16分25秒