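A Problem Encountered with C++ fstream, and Its Analysis

Overview

I ran into a strange problem while using C++ fstream. The investigation and analysis are recorded below.

Requirement

There are two files: file-bak (122K) and file (100K), where file holds the first 100K of file-bak. The task is to read the trailing 22K (22168 bytes) of file-bak and write it to the end of file. Afterwards, comparing file and file-bak should show identical data.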

C++ implementation

Implementation 1

After opening ofile, call ofile.seekp(102400, ios::beg):

#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
    char buffer[22168];

    // Read the last 22168 bytes of file-bak (everything after the first 100K).
    ifstream ifile("file-bak", std::ios_base::in | std::ios_base::binary);
    ifile.seekg(102400);
    ifile.read(buffer, 22168);
    ifile.close();

    // Open the target for writing and seek to the 100K mark before writing.
    ofstream ofile;
    ofile.open("file", std::ios_base::out | std::ios_base::binary);
    ofile.seekp(102400, ios::beg); // seek to offset 102400 from the beginning
    cout << ofile.tellp() << endl;
    ofile.write(buffer, 22168);
    ofile.close();

    return 0;
}

Compile and run:

# g++ -std=c++11 -o fstream_tst fstream_tst.cc
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 18 17:53 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# ./fstream_tst
102400
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 18 17:56 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# diff file file-bak
Binary files file and file-bak differ

The result is wrong.

Implementation 2

After opening ofile, call ofile.seekp(0, ios::end):

#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
    char buffer[22168];

    // Read the last 22168 bytes of file-bak (everything after the first 100K).
    ifstream ifile("file-bak", std::ios_base::in | std::ios_base::binary);
    ifile.seekg(102400);
    ifile.read(buffer, 22168);
    ifile.close();

    // Open the target for writing and seek to its end before writing.
    ofstream ofile;
    ofile.open("file", std::ios_base::out | std::ios_base::binary);
    ofile.seekp(0, ios::end); // seek to the end of the file
    cout << ofile.tellp() << endl;
    ofile.write(buffer, 22168);
    ofile.close();

    return 0;
}

Compile and run:

# g++ -std=c++11 -o fstream_tst fstream_tst.cc
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 18 18:01 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# ./fstream_tst
0
# ll -h file*
-rw-r--r-- 1 root root 22K Oct 18 18:05 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# diff file file-bak
Binary files file and file-bak differ

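The result is wrong.

Strangely, ofile.tellp() prints 0 right after ofile.seekp(0, ios::end), even though the file was 100K before the run.

Implementation 3

Open ofile with the mode std::ios_base::out | std::ios_base::binary | std::ios_base::app: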

#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
    char buffer[22168];

    // Read the last 22168 bytes of file-bak (everything after the first 100K).
    ifstream ifile("file-bak", std::ios_base::in | std::ios_base::binary);
    ifile.seekg(102400);
    ifile.read(buffer, 22168);
    ifile.close();

    // Open the target in append mode: existing data is kept and writes go to the end.
    ofstream ofile;
    ofile.open("file", std::ios_base::out | std::ios_base::binary | std::ios_base::app);
    cout << ofile.tellp() << endl;
    ofile.write(buffer, 22168);
    ofile.close();

    return 0;
}

Compile and run:

# g++ -std=c++11 -o fstream_tst fstream_tst.cc
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 18 18:12 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# ./fstream_tst
102400
# ll -h file*
-rw-r--r-- 1 root root 122K Oct 18 18:15 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# diff file file-bak
#

The result is correct.
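One caveat with app mode is worth spelling out: every write lands at the end of the file, no matter where seekp points. Implementation 3 works here only because the target offset, 102400, happens to be the current end of file. Below is a minimal sketch of the difference. The helpers slurp and write_after_seek and the scratch file they use are hypothetical names made up for this illustration, and the behaviour described is what I would expect on a typical Linux/glibc setup.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read the whole file at `path` into a string.
std::string slurp(const std::string& path) {
    std::ifstream f(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(f),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: create a 10-byte file, reopen it with `mode`,
// seek to offset 2, write "xx", and return the resulting file content.
std::string write_after_seek(const std::string& path, std::ios::openmode mode) {
    {
        std::ofstream init(path, std::ios::binary | std::ios::trunc);
        init << "AAAAAAAAAA";
    }
    {
        std::fstream f(path, mode);   // fstream uses `mode` exactly as given
        assert(f.is_open());
        f.seekp(2, std::ios::beg);    // has no effect on where app-mode writes land
        f.write("xx", 2);
    }                                 // closing the stream flushes the write
    return slurp(path);
}
```

With std::ios::out | std::ios::app | std::ios::binary the two bytes should end up after the ten existing ones, while with std::ios::in | std::ios::out | std::ios::binary they should overwrite offsets 2 and 3. So app is a good fit for pure appends, but not for writing at an arbitrary offset.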

C implementation

The code:

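#include <stdio.h>

int main(int argc, char* argv[])
{
    char buffer[22168];

    /* Read the last 22168 bytes of file-bak (everything after the first 100K). */
    FILE *fpr = fopen("file-bak", "rb");
    fseek(fpr, 102400, SEEK_SET);
    fread(buffer, sizeof(char), 22168, fpr);
    fclose(fpr);

    /* "rb+" opens an existing file for reading and writing without truncating it.
       (The original mode string "rwb+" is not a standard fopen mode.) */
    FILE *fpw = fopen("file", "rb+");
    fseek(fpw, 102400, SEEK_SET);
    fwrite(buffer, sizeof(char), 22168, fpw);
    fclose(fpw);

    return 0;
}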

Compile and run:

# gcc -o fstream_tst fstream_tst.c
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 18 18:23 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# ./fstream_tst
# ll -h file*
-rw-r--r-- 1 root root 122K Oct 18 18:25 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# diff file file-bak
#

The result is correct.

Analysis

Tracing C++ implementation 1 with strace gives:

# strace ./fstream_tst
...
open("file-bak", O_RDONLY) = 3
lseek(3, 102400, SEEK_SET) = 102400
read(3, "4318\n4319\n4320\n4321\n4322\n4323\n43"..., 22168) = 22168
close(3) = 0
open("file", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
lseek(3, 102400, SEEK_SET) = 102400
lseek(3, 0, SEEK_CUR) = 102400
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f81ce101000
write(1, "102400\n", 6102400
) = 6
writev(3, [{NULL, 0}, {"4318\n4319\n4320\n4321\n4322\n4323\n43"..., 22168}], 2) = 22168
close(3)

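The trace shows that the compiled executable opens the output file with: open("file", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3

Strange: why does the executable pass the truncate flag to the open system call?

A long search online turned up no explanation, so I went back to the "bible", C++ Primer, and found the following, which made the cause clear: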

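  • By default, the file associated with an ofstream object is opened in out mode, which makes it writable. A file opened in out mode is truncated: all data stored in it is discarded.
  • So in effect, specifying out mode for an ofstream object is equivalent to specifying both out and trunc.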
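The rule above can be checked directly by opening an existing file with different mode combinations and looking at the size left on disk. The following is a minimal sketch, not a definitive test. The helpers make_file, file_size and size_after_open are hypothetical names invented for this illustration, and the caller picks the scratch file path.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: (re)create `path` containing exactly `content`.
void make_file(const std::string& path, const std::string& content) {
    std::ofstream f(path, std::ios::binary | std::ios::trunc);
    f << content;
}

// Hypothetical helper: size of `path` in bytes, or -1 if it cannot be opened.
long long file_size(const std::string& path) {
    std::ifstream f(path, std::ios::binary | std::ios::ate);  // ate: start at the end
    if (!f) return -1;
    return static_cast<long long>(f.tellg());
}

// Create a 10-byte file, open it with `mode`, close it without writing
// anything, and report the size that is left on disk.
long long size_after_open(const std::string& path, std::ios::openmode mode) {
    make_file(path, "0123456789");
    {
        std::fstream f(path, mode);   // fstream uses `mode` exactly as given
        assert(f.is_open());
    }
    return file_size(path);
}
```

I would expect out | binary alone to leave 0 bytes (the same effect as out | trunc), while in | out | binary and out | app | binary both keep all 10 bytes. Note that out | ate | binary should also leave 0 bytes, because ate only moves the initial position and does not prevent truncation.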

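This also explains implementation 2: the file has already been emptied by the time seekp(0, ios::end) runs, so tellp() reports 0. Given this cause, the code is changed to open the file with in | out, which does not truncate it: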

#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
    char buffer[22168];

    // Read the last 22168 bytes of file-bak (everything after the first 100K).
    ifstream ifile("file-bak", std::ios_base::in | std::ios_base::binary);
    ifile.seekg(102400);
    ifile.read(buffer, 22168);
    ifile.close();

    // in | out opens the existing file for update without truncating it.
    ofstream ofile;
    ofile.open("file", std::ios_base::in | std::ios_base::out | std::ios_base::binary);
    ofile.seekp(102400, ios::beg); // seek to offset 102400 from the beginning
    cout << ofile.tellp() << endl;
    ofile.write(buffer, 22168);
    ofile.close();

    return 0;
}

Compile and run:

# g++ -std=c++11 -o fstream_tst fstream_tst.cc
# ll -h file*
-rw-r--r-- 1 root root 100K Oct 19 12:03 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# ./fstream_tst
102400
# ll -h file*
-rw-r--r-- 1 root root 122K Oct 19 12:04 file
-rw-r--r-- 1 root root 122K Oct 18 15:18 file-bak
# diff file file-bak
#

The result is correct.
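The test programs above skip all error checking, which is part of why the failure was silent. A slightly more defensive variant of the fix might look like the sketch below. The function name copy_tail and its signature are hypothetical, chosen only for this illustration. It assumes the destination already exists and should keep everything before the given offset.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copy the bytes of `src` from `offset` to its end into
// `dst` at the same offset, keeping whatever `dst` already holds before it.
// Returns false if any open, seek, read, or write step fails.
bool copy_tail(const std::string& src, const std::string& dst, std::streamoff offset) {
    if (offset < 0) return false;

    std::ifstream in(src, std::ios::in | std::ios::binary);
    if (!in) return false;

    // Work out how many bytes follow `offset` in the source.
    in.seekg(0, std::ios::end);
    const std::streamoff total = in.tellg();
    if (total < offset) return false;          // also catches tellg() == -1
    std::vector<char> buf(static_cast<std::size_t>(total - offset));

    in.seekg(offset, std::ios::beg);
    if (!buf.empty()) {
        if (!in.read(buf.data(), static_cast<std::streamsize>(buf.size()))) return false;
        assert(in.gcount() == static_cast<std::streamsize>(buf.size()));
    }

    // in | out opens for update without truncating; it fails if `dst` is missing.
    std::fstream out(dst, std::ios::in | std::ios::out | std::ios::binary);
    if (!out) return false;
    out.seekp(offset, std::ios::beg);
    if (!buf.empty()) out.write(buf.data(), static_cast<std::streamsize>(buf.size()));
    out.flush();
    return static_cast<bool>(out);
}
```

For the case in this post, the call would be copy_tail("file-bak", "file", 102400). Reading the tail length from the source file avoids hard-coding 22168, at the cost of holding the whole tail in memory.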

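Conclusion

After most of a day of testing, analysis, and searching, I understand C++ fstream a little better. You really cannot just assume it behaves the way the C functions do. What a trap!

References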

http://www.runoob.com/cplusplus/cpp-files-streams.html
https://stackoverflow.com/questions/34238063/c-seekp0-iosend-not-working
https://stackoverflow.com/questions/29593940/why-is-an-fstream-truncated-when-it-is-opened-with-the-flags-iosate-and-ioso
https://social.msdn.microsoft.com/Forums/vstudio/en-US/8f121287-539f-4fe1-96b6-db3e5b9306f4/vc10-using-stdofstream-truncates-file-without-trunc
