I am running a C/C++ program on a Solaris 5.8 machine. This particular application has a module which saves data to a file. The module uses the fwrite() function to save the data.
The fwrite() call writes about 500 MB of data to a file. The problem I am facing is that the memory consumption of the process increases during the fwrite() but does not decrease once the fwrite() is over.
I tried fflush(filepointer) after the fwrite(), but it did not help.
All of the data is written in one single fwrite() statement. I initially thought the increase in memory was because of this, so I tried writing the data in smaller chunks of 100 MB and 50 MB, but the memory utilization of the process still does not decrease once the fwrite() is over. As a result, a lot of main memory is eaten by this process, which makes the system very slow.
That's right: a process will not shrink when you're using the malloc library, even indirectly. Maybe you can fork(), let the child write the data, and then have the child exit.
------ jim mcnamara
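For reference, the fork() suggestion could be sketched roughly as follows. The function name save_in_child and its return convention are made up for illustration and are not from the original program. The idea is only that any memory the child's stdio layer allocates goes away when the child exits; whether that helps depends on where the memory is actually being held.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (name and interface are assumptions):
 * write len bytes from buf to path in a child process.
 * Returns 0 if the child wrote everything, -1 otherwise. */
int save_in_child(const char *path, const void *buf, size_t len)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {                     /* child: do the write, then exit */
        int failed;
        size_t written;
        FILE *fp = fopen(path, "wb");

        if (fp == NULL)
            _exit(1);
        written = fwrite(buf, 1, len, fp);
        failed = (written != len);
        if (fclose(fp) != 0)            /* fclose flushes the stdio buffer */
            failed = 1;
        /* _exit avoids flushing stdio buffers inherited from the parent */
        _exit(failed ? 1 : 0);
    }

    /* parent: wait for the child and report its result */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The parent would call something like save_in_child("data.bin", buffer, size) in place of its own fwrite().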
You can try using the open() system call with the O_SYNC flag, then call write() for every 1 MB of data.
O_SYNC turns off file buffering, in that every call to write waits until the underlying hardware has completed writing the data.
For fwrite(), setvbuf() and setbuf() do somewhat the same thing.
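A minimal sketch of the open()/O_SYNC approach described above, assuming the data is already in one memory buffer. The function name write_sync_chunks, the 0644 file mode, and the O_CREAT | O_TRUNC flags are assumptions for illustration; only O_SYNC and the 1 MB chunk size come from the suggestion itself.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024)        /* 1 MB per write(), as suggested */

/* Hypothetical helper: write len bytes of buf to path using O_SYNC,
 * one chunk at a time. Returns 0 on success, -1 on error. */
int write_sync_chunks(const char *path, const char *buf, size_t len)
{
    size_t off = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);

    if (fd < 0)
        return -1;

    while (off < len) {
        ssize_t n;
        size_t want = len - off;

        if (want > CHUNK_SIZE)
            want = CHUNK_SIZE;

        /* With O_SYNC, write() returns only after the data reaches the device */
        n = write(fd, buf + off, want);
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted by a signal: retry */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        off += (size_t)n;               /* handles short writes as well */
    }

    return (close(fd) == 0) ? 0 : -1;
}
```

Because every write() waits for the disk, expect this to be much slower than buffered output on large files.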
A word of warning: turning off buffering completely is a very bad idea in terms of performance, especially when writing really big files.
------------------------------------------
I have tried both of these methods. Having a child process do the writing does not solve the problem.
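Today I wrote some test code myself and found that memory is not noticeably consumed only when the O_SYNC flag is set in the open() call that opens the file. Calling fcntl to set O_SYNC after the file has been opened does not help; memory is still noticeably consumed. Personally I think setting the O_SYNC flag that way ought to give synchronous I/O as well, so it may be that I got something wrong in my own test.
DESCRIPTION
The function fflush() forces a write of all user-space buffered data for the given output or update stream via the stream's underlying write function. The open status of the stream is unaffected.
fflush() only writes the user-space buffered data to the stream. That does not mean the data has been written to disk; it may only have reached the kernel's block buffers.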
This is most likely an operating-system resource-scheduling issue rather than a programming issue. I tested it under Linux kernel 2.6.9 (256 MB of RAM).
Writing the file the way the original poster does, the behaviour does show up. However, top shows that the process's own virtual and physical memory consumption is small and basically stable, while vmstat reports the following system state:
[root@localhost ~]# vmstat 60
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0    224 116320   4780  35520    0    0    59   125  537   216  3  5 91  1
 0  0    224  94288   5096  57304    0    0     0   371 1072   486  5  6 88  1
 0  0    224  70192   5432  81148    0    0     0   418 1057   351  2  3 95  0
 0  0    224  46312   5764 104736    0    0     0   412 1058   326  1  3 96  0
 0  0    224  21892   6284 128656    0    0     8   430 1071   395  2  3 94  1
 0  0    224  18156   5264 133056    0    0    47   414 1087   531  3  4 91  1
 0  0    224  17312   5284 133816    0    0     1   421 1132   574  3  4 92  0
 0  0    224  18620   5252 132288    0    0     0   417 1058   372  2  3 95  0
Note the trend of free and cache under memory (samples are 60 seconds apart): the system's free memory is being used as cache in order to improve disk write performance.
The following passage, excerpted from http://www.faqs.org/docs/linux_admin/buffer-cache.html, describes the same behaviour:
If the cache is of a fixed size, it is not very good to have it too big, either, because that might make the free memory too small and cause swapping (which is also slow). To make the most efficient use of real memory, Linux automatically uses all free RAM for buffer cache, but also automatically makes the cache smaller when programs need more memory. Under Linux, you do not need to do anything to make use of the cache, it happens completely automatically. Except for following the proper procedures for shutdown and removing floppies, you do not need to worry about it.
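(3) Some synchronization functions
1. fflush(): The function fflush() forces a write of all user-space buffered data for the given output or update stream via the stream's underlying write function. The open status of the stream is unaffected.
My understanding is that this function forces the data buffered by fwrite() down to a lower layer (kernel space? cache? disk?). I am not sure exactly where it ends up; my guess is that it is pushed into the kernel space used by write().
2. fsync(): fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) where that file resides.
My understanding is that this function pushes the (modified) kernel-space data from write() out to the disk device. However, I do not understand how the "buffer cache pages" mentioned here relate to kernel space.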
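To make the relationship between the two calls concrete, here is a small sketch of using them together on a stdio stream. fflush() hands the stream's user-space buffer to the kernel through write(), and fsync() then asks the kernel to push the file's modified cached pages to the storage device. The helper name flush_to_disk is made up for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push everything written to fp so far
 * all the way to the storage device.
 * Returns 0 on success, -1 on error. */
int flush_to_disk(FILE *fp)
{
    /* Step 1: user-space stdio buffer -> kernel (via write()) */
    if (fflush(fp) != 0)
        return -1;

    /* Step 2: kernel's cached pages for this file -> disk device */
    if (fsync(fileno(fp)) != 0)
        return -1;

    return 0;
}
```

Calling flush_to_disk(fp) after the fwrite() makes the data durable, but it does not by itself tell the kernel to drop the pages from its cache, so the cache figure in vmstat may stay high.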